International Journal of UbiComp (IJU), Vol.6, No.3, July 2015
PERFORMANCE COMPARISON OF
and Ms. A.Sakila
Assistant Professor, Department of Computer Science, School of Computer Science and
Engineering, Bharathiar University, Coimbatore.
M.Phil Research Scholar, Department of Computer Science, School of Computer
Science and Engineering, Bharathiar University, Coimbatore.
Optical Character Recognition (OCR) is a technique used to convert a scanned image into an editable text
format. Many OCR tools are commercially available today, and OCR is a useful and popular method for many
types of applications. The accuracy of an OCR result depends on the text pre-processing and segmentation
algorithms used. Image quality is one of the most important factors affecting recognition quality in OCR
tools. Images can be processed independently (.png, .jpg, and .gif files) or as multi-page PDF documents
(.pdf). The primary objective of this work is to provide an overview of various OCR tools and to analyse
their performance using two measures of OCR tool performance, namely accuracy and error rate.
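As an illustration of the two measures named above, the following minimal sketch computes a character-level error rate and accuracy by comparing OCR output against a ground-truth transcription. It assumes the error rate is the Levenshtein edit distance divided by the length of the reference text, and that accuracy is one minus that value; this is a common convention, not necessarily the exact definition used in this work. The function names are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def error_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - error rate, clamped at zero."""
    return max(0.0, 1.0 - error_rate(reference, hypothesis))
```

For example, `error_rate("OCR tools", "0CR tool")` counts one substituted and one missing character against nine reference characters. Under this definition, a tool that inserts many spurious characters can exceed an error rate of 1, which is why the accuracy is clamped at zero.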
Optical Character Recognition (OCR), Online OCR, Free Online OCR, OCR Convert, Convert image to
text.net, Free OCR, i2OCR, Free OCR to Word Convert, Google Docs.
Optical Character Recognition technology automatically recognizes the text in images. It
supports different image formats such as JPG, PNG, BMP, GIF, TIFF and multi-page PDF
files. OCR involves analysing captured or scanned images and then translating character
images into character codes, so that the text can be edited, searched, stored more efficiently,
displayed on-line, and used in machine processes. Text can easily be extracted from scanned
images with the help of different OCR tools. OCR works with images that consist mostly of
text. The output of a tool depends on the type of input image. Achieving 100% accuracy is not
possible, but a partially correct result is better than none. To improve accuracy, most OCR
tools use dictionaries: they first recognize individual characters and then try to recognize
entire words that exist in the selected dictionary. Text is sometimes very difficult to extract
because of varying font sizes, styles, symbols and dark backgrounds. OCR tools produce their
best results on high-resolution documents. Many OCR tools are available now, but only a few
of them are open source and free. Normally, the processing in an OCR tool has five important
steps: preprocessing, segmentation, feature extraction, classification/recognition and
post-processing. This is depicted in Figure 1.
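The dictionary-based correction mentioned above can be sketched as a small post-processing step. The example below is an illustration only, not the method of any particular OCR tool: it assumes that a recognized word missing from the dictionary is replaced by its closest dictionary entry when the similarity exceeds a threshold. It uses `difflib.get_close_matches` from the Python standard library; the function names, the sample dictionary and the cutoff of 0.8 are hypothetical choices.

```python
import difflib


def correct_word(word: str, dictionary: list[str], cutoff: float = 0.8) -> str:
    """Return the closest dictionary entry, or the word itself if none is close."""
    lowered = word.lower()
    if lowered in dictionary:
        return word  # already a known word, keep it unchanged
    # n=1 asks for the single best match whose similarity ratio is >= cutoff.
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text: str, dictionary: list[str], cutoff: float = 0.8) -> str:
    """Apply word-level correction to whitespace-separated OCR output."""
    return " ".join(correct_word(w, dictionary, cutoff) for w in text.split())


# Hypothetical dictionary and OCR output, for illustration only.
sample_dictionary = ["optical", "character", "recognition"]
corrected = correct_text("optical charaoter recognltion", sample_dictionary)
```

A high cutoff keeps the step conservative: words with no close dictionary entry, such as names or numbers, pass through unchanged. This sketch also ignores punctuation and returns corrected words in lower case, which a real post-processing stage would need to handle.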