Advances in location-acquisition and wireless communication technologies have led to wider availability of spatio-temporal (ST) data. Deep neural networks (DNNs) have been successfully applied to various problems, such as computer vision, speech recognition, natural language understanding. Being different from these domains, ST data has unique spatial properties (i.e. geographical hierarchy and distance) and temporal properties (i.e. closeness, period and trend). It is very challenging to capture all of these ST properties simultaneously.