Comparative Analysis of Visual Shape Features for Applications to Hand Pose Estimation
Abstract: Being able to determine the pose of a hand is an important task for an artificial agent that is to support a cognitive system. Hand pose estimation in particular, given the highly articulated nature of the hand, is essential for a number of applications such as automatic sign language recognition and robot learning from demonstration. A typical hand model is formulated with around 30-50 degrees of freedom, implying a wide variety of possible configurations with a high degree of self-occlusion, which leads to ambiguities and difficulties in automatic recognition. In addition, we are often interested in using a passive sensor, such as a camera, to extract this information. These properties of hand poses warrant robust, efficient and consistent visual shape descriptors that can be used seamlessly for automatic hand pose estimation and hand tracking.
A view of the environment that is conducive to probabilistic modeling is to regard it as controlled by an underlying, unobserved latent variable. Given the observations from the environment (hand images) and the features extracted from them, it is of interest to infer the state of the latent variable that controls the process generating the data (hand pose). It therefore becomes essential to investigate both the generative methods, which produce hand images from well-defined poses, and the discriminative inverse problem, in which a hand pose must be recognized from an observed image. Central to both paradigms is the need to formulate a measure of goodness for comparing high-dimensional data and, separately, for examining a model tailored to some data.
In this project, three prototypical state-of-the-art visual shape descriptors, commonly used for hand and human body pose estimation, are evaluated.
The nature of the mappings from the hand pose space to the feature spaces spanned by the visual shape descriptors is studied in terms of the smoothness, discriminability, and generativity of the pose-feature mappings, as well as the robustness of these properties to noise. Based on this, recommendations are given on which types of applications each visual shape descriptor is suitable for. Novel goodness measures are devised to quantify data similarities and to provide a scale for the performance of these visual shape descriptors. The evaluation of the experiments provides a basis for creating novel and improved models for hand pose estimation.