Evaluation of deep learning methods for industrial automation

University essay from Umeå universitet/Institutionen för datavetenskap

Abstract: The adaptation of the transformer architecture from natural language processing to visual tasks has produced a useful and powerful tool. Subsequent architectures such as the Vision Transformer (ViT) and the shifted window (Swin) Transformer have proven comparable to, and oftentimes better than, convolutional neural networks (CNNs) in terms of accuracy. However, for mobile vision tasks and limited hardware, the computational complexity of the transformer architecture is an impediment. This project aims to answer the question of whether the Swin Transformer can be adapted towards lightweight, low-latency classification as a basis for industrial automation, and how it compares to CNNs for a specific task. A case study from the logging industry, binary classification of wooden boards on chain conveyors, serves as the basis of this evaluation. For this purpose, a novel dataset has been collected and annotated. The results of this project include an overview of the respective architectures and their performance for different implementations on the classification task. Both architectures exhibited sufficient accuracy, while the CNN models performed best for the specific case study.
