Object Tracking Achieved by Implementing Predictive Methods with Static Object Detectors Trained on the Single Shot Detector Inception V2 Network

University essay from Karlstads universitet/Fakulteten för hälsa, natur- och teknikvetenskap (from 2013)

Abstract: In this work, the possibility of realising object tracking by implementing predictive methods with static object detectors is explored. The static object detectors are obtained as models trained on a machine learning algorithm, or in other words, a deep neural network. Specifically, it is the single shot detector inception v2 network that will be used to train such models. Predictive methods will be incorporated to the end of improving the obtained models’ precision, i.e. their performance with respect to accuracy. Namely, Lagrangian mechanics will be employed to derived equations of motion for three different scenarios in which the object is to be tracked. These equations of motion will be implemented as predictive methods by discretising and combining them with four different iterative formulae. In ch. 1, the fundamentals of supervised machine learning, neural networks, convolutional neural networks as well as the workings of the single shot detector algorithm, approaches to hyperparameter optimisation and other relevant theory is established. This includes derivations of the relevant equations of motion and the iterative formulae with which they were implemented. In ch. 2, the experimental set-up that was utilised during data collection, and the manner by which the acquired data was used to produce training, validation and test datasets is described. This is followed by a description of how the approach of random search was used to train 64 models on 300×300 datasets, and 32 models on 512×512 datasets. Consecutively, these models are evaluated based on their performance with respect to camera-to-object distance and object velocity. In ch. 3, the trained models were verified to possess multi-scale detection capabilities, as is characteristic of models trained on the single shot detector network. While the former is found to be true irrespective of the resolution-setting of the dataset that the model has been trained on, it is found that the performance with respect to varying object velocity is significantly more consistent for the lower resolution models as they operate at a higher detection rate. Ch. 3 continues with that the implemented predictive methods are evaluated. This is done by comparing the resulting deviations when they are let to predict the missing data points from a collected detection pattern, with varying sampling percentages. It is found that the best predictive methods are those that make use of the least amount of previous data points. This followed from that the data upon which evaluations were made contained an unreasonable amount of noise, considering that the iterative formulae implemented do not take noise into account. Moreover, the lower resolution models were found to benefit more than those trained on the higher resolution datasets because of the higher detection frequency they can employ. In ch. 4, it is argued that the concept of combining predictive methods with static object detectors to the end of obtaining an object tracker is promising. Moreover, the models obtained on the single shot detector network are concluded to be good candidates for such applications. However, the predictive methods studied in this thesis should be replaced with some method that can account for noise, or be extended to be able to account for it. A profound finding is that the single shot detector inception v2 models trained on a low-resolution dataset were found to outperform those trained on a high-resolution dataset in certain regards due to the higher detection rate possible on lower resolution frames. Namely, in performance with respect to object velocity and in that predictive methods performed better on the low-resolution models.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)