Natural image distortions and optical character recognition accuracy

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Authors: Filip Lundqvist; Olle Wallberg [2016]

Abstract: Current state-of-the-art optical character recognition (OCR) tools are trained on high-quality image datasets. In practical applications, natural images used for character recognition will not always be of high quality. This report examines the accuracy of a state-of-the-art OCR tool on three distorted natural image datasets. The distortions applied were lossy JPEG compression, contrast reduction, and white Gaussian noise injection. Accuracy is presented as the average percentage of correctly recognized and located text, computed using the Levenshtein distance algorithm. The results indicate that white Gaussian noise injection significantly reduced OCR accuracy. Lossy JPEG compression and contrast reduction, on the other hand, had a similar but smaller effect.
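
As an illustration of the metric mentioned in the abstract, the following is a minimal Python sketch of how a Levenshtein-based accuracy percentage could be computed. The abstract does not state how the distance is converted into a percentage, so the normalisation by the longer string's length, and the function names `levenshtein`, `accuracy_percent` and `mean_accuracy`, are assumptions made for this example and not necessarily the authors' method.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def accuracy_percent(recognized: str, truth: str) -> float:
    """Assumed normalisation: 100 * (1 - distance / length of longer string)."""
    longest = max(len(recognized), len(truth))
    if longest == 0:
        return 100.0  # two empty strings are treated as a perfect match
    return 100.0 * (1 - levenshtein(recognized, truth) / longest)


def mean_accuracy(pairs) -> float:
    """Average accuracy over (recognized, truth) string pairs."""
    scores = [accuracy_percent(r, t) for r, t in pairs]
    if not scores:
        raise ValueError("at least one (recognized, truth) pair is required")
    return sum(scores) / len(scores)
```

For example, under this assumed normalisation, an OCR output of "EX1T" against the ground truth "EXIT" differs by one substitution out of four characters and scores 75 percent.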
