TrajMesa: A Distributed NoSQL-Based Trajectory Data Management System


With the development of positioning technology, a large number of trajectories have been generated, which are very useful for many urban applications. However, trajectory data is challenging to manage because of its spatio-temporal dynamics and high volume. Existing trajectory data management frameworks suffer from efficiency or scalability problems and support only a limited set of trajectory query types. This paper makes the first attempt to build a holistic distributed NoSQL trajectory storage engine, named TrajMesa, based on GeoMesa, an open-source indexing toolkit for spatio-temporal data. TrajMesa can manage a very large number of trajectories and supports a wide range of query types efficiently. Specifically, we first design a novel trajectory storage schema that substantially reduces storage size. We then devise a novel indexing key schema for time ranges, on which ID temporal queries can be supported efficiently. To reduce the amount of trajectory data retrieved for a spatial range query, we propose a position code that accurately indicates the spatial location of trajectories. We also propose a set of pruning strategies for similarity queries and k-NN queries in the NoSQL environment. Extensive experiments on two real datasets and one synthetic dataset verify the query efficiency and scalability of TrajMesa.

IEEE Transactions on Knowledge and Data Engineering, 2021