A comparison of OCR methods on natural images in different image domains

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Melvin Lundqvist; Agnes Forsberg; [2020]

Abstract: Optical character recognition (OCR) is a blanket term for methods that convert printed or handwritten text into machine-encoded text. As the digital world keeps growing, the number of digital images containing text increases, and so does the need for OCR methods that can handle more than plain text documents. There are OCR engines that can convert images of clean documents with a recognition rate of over 99%. OCR for natural images is receiving more and more attention, but because natural images can be far more diverse than plain text documents, they also introduce complications. To address these issues, it needs to be clear in which areas today's OCR methods struggle. This thesis aims to answer this by testing three popular, readily available OCR methods on a dataset composed only of natural images containing text. The results show that one of the methods, GOCR, cannot handle natural images, as its test results were very far from correct. For the other two methods, ABBYY FineReader and Tesseract, the results were better but also show that there is still a long way to go, especially for images with special fonts. However, when the images are less complicated, some of the methods performed above our expectations.
