Writer identification using semi-supervised GAN and LSR method on offline block characters

University essay from Högskolan i Halmstad/Akademin för informationsteknologi

Abstract: Block characters are often used when filling out forms, for example when writing one's personal number. This raises the question of whether there is recoverable biometric (identity-related) information within individual digits of handwritten personal numbers. This thesis investigates the question by using both handcrafted features and features extracted via Deep learning (DL) models, while successively limiting the number of available training samples. Some recent works using DL have presented semi-supervised methods that use data generated by a Generative adversarial network (GAN) together with a modified Label smoothing regularization (LSR) function. Using this training method might improve authentication performance over a baseline fully supervised model. This work additionally proposes a novel modified LSR function named Bootstrap label smoothing regularizer (BLSR), designed to mitigate some of the problems of previous methods, and compares it to the others. The DL feature extraction is done by training a ResNet50 model to recognize the writers of personal numbers and then extracting the feature vector from the second-to-last layer of the network. Results show a clear indication of recoverable identity-related information within the handwritten (personal number) digits in boxes. Our results indicate an authentication performance, expressed as Equal error rate (EER), of around 25% with handcrafted features. The same performance measured in EER was between 20% and 30% when using the features extracted from the DL model. The DL methods, while showing potential for greater performance than the handcrafted features, seem to suffer from fluctuation (noisiness) of results, making conclusions on their use in practice hard to draw.
Additionally, when using 1-2 training samples, the handcrafted features easily beat the DL methods. When using the LSR-variant semi-supervised methods, there is no noticeable performance boost, and BLSR gets the second-best results among the alternatives.
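
Since the results above are reported as EER, a small illustration of how that metric can be computed may help. The following is a minimal sketch, not the evaluation code of the thesis: the function name `equal_error_rate`, the assumption that a higher score means "more likely the same writer", and the toy scores are all hypothetical. It sweeps a decision threshold over the observed scores and reports the error rate at the point where the false acceptance rate and the false rejection rate are closest.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption (not from the thesis): a higher score means the two samples
    are more likely to come from the same writer.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a genuine comparison scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: an impostor comparison scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy similarity scores, made up purely for illustration.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine, impostor))
```

With these toy scores, one genuine and one impostor comparison fall on the wrong side of the best threshold, so both error rates are one in four. An EER of around 25%, as reported for the handcrafted features, corresponds to the same balance point on the real data.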
