Classifying hand-drawn documents in mobile settings, using transfer learning and model compression

University essay from KTH/School of Computer Science and Communication (CSC)

Abstract: In recent years, the state of the art in computer vision has improved immensely due to the increased use of convolutional neural networks (CNNs). However, the best-performing models are typically complex and too slow or too large for mobile use. We investigate whether the power of these large models can be transferred to smaller models suitable for mobile applications. A small CNN model was designed based on VGG Net. Using transfer learning, three pre-trained ImageNet networks were tuned to perform hand-drawn image classification. The models were evaluated on their predictive power, and the best model was compressed into the small CNN model using knowledge distillation, a form of model compression. We found a small but significant improvement in classification performance compared to training the small CNN model directly on the training data. No such improvement was found in localization ability. We conclude that model compression, and knowledge distillation in particular, is a valuable tool for mobile deep learning development.
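To illustrate the knowledge distillation approach the abstract refers to, the sketch below implements the standard distillation objective (Hinton et al.): a temperature-softened cross-entropy against the teacher's outputs, blended with the usual hard-label loss. This is a minimal NumPy illustration, not the thesis's actual implementation; the temperature `T=4.0` and weight `alpha=0.7` are hypothetical example values.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T produces softer distributions,
    # exposing the teacher's relative confidences across wrong classes.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend of soft-target loss (cross-entropy against the teacher's
    softened distribution, scaled by T^2 as in Hinton et al.) and the
    ordinary hard-label cross-entropy. T and alpha are example values."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student_soft = np.log(softmax(student_logits, T) + 1e-12)
    soft_loss = -(p_teacher * log_p_student_soft).sum(axis=-1).mean() * T ** 2
    log_p_student = np.log(softmax(student_logits) + 1e-12)
    hard_loss = -log_p_student[np.arange(len(labels)), labels].mean()
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

A student whose logits agree with the teacher (and the labels) incurs a lower loss than one that contradicts both, which is the signal that lets the small CNN absorb the larger model's learned class similarities.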
