Hand Detection and Pose Estimation using Convolutional Neural Networks

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Abstract: This thesis examines how convolutional neural networks can be applied to the problems of hand detection and hand pose estimation. Two families of convolutional neural networks are trained, one for classification and one for regression. The networks are trained on specialized data generated from publicly available datasets, and the algorithms used to generate that data are also disclosed. The main focus has been to investigate the structural properties of convolutional neural networks, not to build optimized hand detection or hand pose estimation systems. Experiments revealed that classifier networks featuring a relatively high number of convolutions offer the highest performance on external validation data. Shallow classifier networks featuring a relatively low number of convolutions, by contrast, yield high classification accuracy on training and testing data but very low accuracy on the validation set. This effect uncovers one of the fundamental difficulties in building a hand detection system: the asymmetric classification problem. Further investigation suggests that relatively shallow classifier networks probably become color sensitive. Regressor networks featuring multiscale inputs typically yielded the lowest error when tasked with computing key-point locations directly from data. The experiments also revealed that color data implicitly contain more information, making it easier to compute key-point locations, especially in the image space. However, deeper regressor networks are required to derive color-invariant features.
