Literature Study and Assessment of Trajectory Data Mining Tools
Abstract: With the development of technologies such as Global Navigation Satellite Systems (GNSS), mobile computing, and Information and Communication Technology (ICT), the procedure of sampling positional data has lately been significantly simplified. This enables the aggregation of large amounts of moving-object data (i.e. trajectories) containing potential information about the moving objects. Within Knowledge Discovery in Databases (KDD), automated processes for extracting this information, called trajectory data mining, have been implemented. The objectives of this study are to examine 1) how trajectory data mining tasks are defined at an abstract level, 2) what type of information it is possible to extract from trajectory data, 3) what solutions trajectory data mining tools implement for different tasks, 4) how tools use visualization, and 5) what the limiting aspects of input data are and how those limitations are treated. The topic, trajectory data mining, is examined in a literature review, in which a large number of academic papers found through Google searches were screened for information relevant to the objectives stated above. The literature review found that there are several challenges along the process of arriving at useful knowledge about moving objects. For example, the discrete modelling of movements as polylines is associated with an inherent uncertainty, since the location between two sampled positions is unknown. To reduce this uncertainty and prepare raw data for mining, the data often needs to be pre-processed. The nature of the pre-processing depends on the sampling rate and accuracy of the raw input data as well as on the requirements of the specific mining method. Another major challenge is to define relevant knowledge and effective methods for extracting it from the data. Furthermore, conveying mining results to users is an important function.
Presenting results in an informative way, both at the level of individual trajectories and sets of trajectories, is a vital but far from trivial task, for which visualization is an effective approach. Abstractly defined instructions for data mining are formally denoted as tasks. There are four main categories of mining tasks: 1) managing uncertainty, 2) extrapolation, 3) anomaly detection, and 4) pattern detection. The enumeration of tasks within this study provides a basis for an assessment of tools used to execute these tasks. To arrive at useful results, the dimensions of comparison are selected with the intention of covering the essential parts of the knowledge discovery process. The measures used to appraise this are chosen so that the results correctly reflect the 1) sophistication, 2) user-friendliness, and 3) flexibility of tools. The focus of this thesis is freely available tools, the range of which proved to be very small and fragmented. The tools found and reported on are MoveMine 2.0, MinUS, GeT_Move and M-Atlas. The tools are reviewed entirely on the basis of their documentation. The performance of the tools was found to vary along all dimensional measures except visualization and graphical user interface, which all tools provide. Overall, the systems perform well with respect to user-friendliness, moderately well with respect to sophistication, and poorly with respect to flexibility. However, since the range of tasks that the tools are intended to solve varies, it might not be appropriate to compare the tools in terms of better or worse. This thesis further provides some theoretical insights for users regarding the knowledge required of them, concerning both the technical aspects of the tools and the nature of the moving objects.
Furthermore, the future of trajectory data mining is discussed in terms of constraints on information extraction and requirements for the development of tools, where the need for a more robust open-source solution is emphasised. Finally, this thesis as a whole can be regarded as providing guidance on which trajectory mining tools to use depending on the application. Complementary work comparing the actual performance of the tools in use is desirable.