Chapter 2
Sameeksha Barve [49] has proposed optical character recognition based on an
artificial neural network (ANN). The ANN is trained using the Back
Propagation algorithm. In the proposed system, each typed English letter is
represented by binary numbers that are used as input to a simple feature
extraction system whose output, together with the original input, is fed to the
ANN. The Feed Forward Algorithm gives insight into the inner workings of the
neural network, followed by the Back Propagation Algorithm, which
comprises Training, Calculating Error, and Modifying Weights.
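The three back propagation stages named above can be illustrated with a minimal sketch. It assumes one hidden layer, sigmoid units, a squared-error loss and no bias terms; the layer sizes, learning rate and sample below are illustrative choices, not values reported in [49].

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Feed-forward pass: input -> hidden layer -> output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One back propagation step; returns the squared error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Calculating error: output deltas for squared error with sigmoid units.
    delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    # Propagate the error back to the hidden layer (using the old output weights).
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Modifying weights: plain gradient-descent update, done in place.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] -= lr * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hid[j] * x[i]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))

# Illustrative sizes and data (assumptions, not taken from the cited paper).
rng = random.Random(0)
n_in, n_hid, n_out = 4, 3, 2
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]

x, target = [1, 0, 1, 1], [1.0, 0.0]   # a binary-coded input and its target
errors = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
```

Repeating the step on the same sample should drive the error in `errors` downwards; a real system would loop over all training characters instead.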
Md. Fazlul Kader and Kaushik Deb [50] have proposed a simple colour- and
size-invariant character recognition system based on a feed-forward neural
network to recognize English alphanumeric characters. They use a single-layer
feed-forward neural network, so there is no hidden layer and only one type of
weight, the input-output weight. The input character matrix is normalized into
a 12×8 matrix for size-invariant recognition and fed into the proposed network,
which consists of 96 input and 36 output neurons. They tested the network with
more than 20 samples per character on average and report 99.99% accuracy for
numeric digits only (0-9), 98% accuracy for letters only (A-Z), and more than
94% accuracy for alphanumeric characters by considering inter-class similarity
measurement.
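The 12×8 normalization and the 96-input, 36-output layout can be sketched as follows. Nearest-neighbour resampling is an assumption (the resampling method is not stated here), and the weights are a hand-set template used only to make the example runnable, not trained values from [50].

```python
LABELS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 36 output classes

def normalize(matrix, rows=12, cols=8):
    """Nearest-neighbour resampling of a binary character matrix to rows x cols."""
    src_rows, src_cols = len(matrix), len(matrix[0])
    return [
        [matrix[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]

def classify(matrix, weights, labels=LABELS):
    """Single-layer network: 96 inputs connected directly to 36 outputs."""
    x = [pixel for row in normalize(matrix) for pixel in row]   # 12 * 8 = 96 inputs
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return labels[scores.index(max(scores))]

# A toy 12x8 pattern and the same pattern drawn at twice the size.
small = [[(r + c) % 2 for c in range(8)] for r in range(12)]
large = [[small[r // 2][c // 2] for c in range(16)] for r in range(24)]

# Placeholder weights: all zero except a template for "A" (illustrative only).
weights = [[0.0] * 96 for _ in LABELS]
weights[LABELS.index("A")] = [float(p) for row in small for p in row]

same_after_resize = normalize(large) == small
predicted = classify(large, weights)
```

Because both sizes normalize to the same 12×8 matrix, the network sees identical inputs for them, which is the sense in which the recognition is size invariant.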
Pradeep et al. [51] have proposed a neural network based classifier for
handwritten character recognition. Each individual character is resized
to 30 × 20 pixels for processing. They use binary features to train the neural
network; however, such features are not robust. In the post-processing stage,
recognized characters are converted to ASCII format. The input layer has 600
neurons, equal to the number of pixels. The output layer has 26 neurons, as the
English alphabet has 26 letters. The proposed ANN uses the back propagation
algorithm with momentum and an adaptive learning rate [52].
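The exact update rule is not given here, so the following is only a sketch of one common heuristic for gradient descent with momentum and an adaptive learning rate: grow the rate after a step that lowers the error, and shrink it (discarding the step) otherwise. The function names, the factors 1.05 and 0.7, and the toy quadratic loss are all assumptions made for illustration.

```python
def train(loss, grad, w, lr=0.1, momentum=0.9, lr_inc=1.05, lr_dec=0.7, epochs=100):
    """Gradient descent with a momentum term and an error-driven learning rate."""
    velocity = [0.0] * len(w)
    best = loss(w)
    for _ in range(epochs):
        g = grad(w)
        # Momentum: blend the previous step with the new gradient step.
        velocity = [momentum * v - lr * gi for v, gi in zip(velocity, g)]
        candidate = [wi + v for wi, v in zip(w, velocity)]
        current = loss(candidate)
        if current <= best:
            # Error went down: accept the step and speed up.
            w, best = candidate, current
            lr *= lr_inc
        else:
            # Error went up: reject the step, reset momentum and slow down.
            velocity = [0.0] * len(w)
            lr *= lr_dec
    return w, best

# Toy loss standing in for the network error (illustrative only).
def loss(w):
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2

def grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

start = [0.0, 0.0]
w_final, final_loss = train(loss, grad, start)
```

In a real network, `loss` and `grad` would be the training error and its back-propagated gradient over the 600-input, 26-output weights.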
Velappa et al. [53] have proposed a multi-scale neural network based approach.
The proposed system first converts the camera-captured RGB image to a binary
image. The Width to Height ratio (WH), Relative Height ratio (RH) and Relative
Width ratio (RW) are calculated to remove unnecessary connected components from
the image. For the multi-scale neural network, each detected character is
resized to 20 × 28 pixels, 10 × 14 pixels and 5 × 7 pixels. Binary features of
these different-resolution images are given to a three-layer feed-forward
network trained with the back propagation algorithm [52].
Akash Ali et al. [54] have proposed handwritten Bangla character recognition
using a back propagation feed-forward neural network. First, a binary image is
created; then features are extracted to form the input vector, which is applied
to the neural network. The experimental results show that the proposed
recognition method gives 84% accuracy at a lower computational cost than other
methods.
V. Kalaichelvi and Ahammed Shamir Ali [55] have explored the application of
neural networks to recognizing characters from a printed script. In contrast
to traditional methods of generalizing the character set, a highly specific
character set is trained for each type.
Ujjwal Bhattacharya and B. B. Chaudhuri [56] used a distinct MLP classifier.
They worked on Devanagari, Bengali and English handwritten numerals. A
back propagation (BP) algorithm was used for training the MLP classifiers. It
provided 99.27% and 99.04% recognition accuracies on the original training
and test sets of the Devanagari numeral database, respectively [57].
2.2.5 Support Vector Machine
Support vector machines (SVMs, also called support vector networks) are a set of
related supervised learning methods used for classification.
SVMs are a relatively new approach compared to other supervised classification
algorithms. They are based on statistical learning theory developed by the
Russian scientist Vladimir Naumovich Vapnik back in 1962; since then,
his original ideas have been refined by a series of new techniques and
algorithms [58].
Support vector machines have been shown to achieve good generalization
performance with no prior knowledge of the data. The principle of an SVM is
to map the input data onto a higher-dimensional feature space nonlinearly
related to the input space and determine a separating hyperplane with
maximum margin between the two classes in the feature space [60] [61]. In
general, the larger the margin, the lower the generalization error of the
classifier [62].
If such a hyperplane exists, it provides the best separation border between the
two classes. It is known as the maximum-margin hyperplane, and the
corresponding linear classifier is known as the maximum-margin classifier [62].
Figure 2.10 Separation hyperplanes. H1 does not separate the two classes;
H2 separates them but with a very small margin; and H3 separates the
two classes with a much better margin than H2
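The link between the margin and the classifier can be made explicit. As a sketch of the standard linearly separable formulation (the symbols $w$, $b$, $x_i$ and $y_i \in \{-1, +1\}$ are notation introduced here for illustration), the maximum-margin hyperplane $w \cdot x + b = 0$ is the solution of:

```latex
\min_{w,\,b} \; \frac{1}{2}\,\|w\|^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, n

\text{margin} = \frac{2}{\|w\|}
```

Minimizing $\|w\|$ is therefore equivalent to maximizing the margin, which is the property that distinguishes H3 from H2 in the figure.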
A support vector machine is a maximal-margin hyperplane in feature space
built by using a kernel function in the input space. This results in a nonlinear
boundary in the input space. The optimal separating hyperplane can be
determined without any computations in the higher-dimensional feature space
by using kernel functions in the input space [60]. Commonly used kernels
include:
I. Linear Kernel:
$K(x, y) = x \cdot y$
II. Radial Basis Function (Gaussian) Kernel:
$K(x, y) = \exp\left(-\|x - y\|^2 / (2\sigma^2)\right)$
III. Polynomial Kernel:
$K(x, y) = (x \cdot y + 1)^d$
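The three kernels listed above translate directly into code. The sketch below evaluates each on a pair of small vectors; the default values of `sigma` and `d` and the sample vectors are illustrative assumptions.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return dot(x, y)

def rbf_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, d=2):
    """K(x, y) = (x . y + 1)^d"""
    return (dot(x, y) + 1) ** d

x, y = [1.0, 2.0], [3.0, 0.5]
k_lin = linear_kernel(x, y)       # 1*3 + 2*0.5 = 4.0
k_poly = polynomial_kernel(x, y)  # (4 + 1)^2 = 25.0
k_rbf = rbf_kernel(x, y)          # strictly between 0 and 1 for x != y
```

Each function works only with vectors in the input space, which is why the higher-dimensional feature space never has to be computed explicitly.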
For multi-class classification, binary SVMs are combined in either a
one-against-all or a one-against-one (pairwise) scheme.
I. One against All
The “one against all” strategy consists of constructing one SVM per class to
separate members of that class from members of other classes. Usually,
classification of an unknown pattern is done according to the maximum output
among all SVMs.
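A minimal sketch of the "maximum output" decision rule follows. It assumes each per-class SVM has already been trained and is linear, so that it can be represented by a weight vector and bias; the three models below are hypothetical values chosen for illustration.

```python
def decision_value(w, b, x):
    """Output of one binary SVM: signed score of x for 'this class vs the rest'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict_one_against_all(models, x):
    """Return the class whose SVM gives the largest output for x."""
    return max(models, key=lambda label: decision_value(*models[label], x))

# One (weights, bias) pair per class; hypothetical, not trained values.
models = {
    "A": ([1.0, 0.0], -0.5),
    "B": ([0.0, 1.0], -0.5),
    "C": ([-1.0, -1.0], 0.5),
}

label_1 = predict_one_against_all(models, [2.0, 0.1])    # "A" scores highest
label_2 = predict_one_against_all(models, [0.0, 3.0])    # "B" scores highest
label_3 = predict_one_against_all(models, [-1.0, -1.0])  # "C" scores highest
```

With k classes this scheme needs k binary classifiers, whereas the pairwise scheme needs k(k-1)/2.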
pport Vector Machine
osed a recognition model for English handwritten
letter) character recognition that uses Freeman
representation technique of an image character.
VM) has been chosen for the classification. The
, built from SVM classifiers was efficient enough
Figure 2.12 Diagram of binary One against One region boundaries on a basic
problem
Shailendra Kumar Shrivastava and Sanjay S. Gharde [66] have proposed a
Support Vector Machine for handwritten Devanagari numeral recognition.
Binary classification with the Support Vector Machine is implemented
using a linear kernel function. This linear SVM produces a 99.48%
overall recognition rate.
Munish Kumar, M. K. Jindal and R. K. Sharma [67] have proposed a Support
Vector Machine for handwritten Gurumukhi characters. The SVM classifier is
employed in three flavors: with a linear kernel, with a polynomial kernel and
with an RBF kernel. The features have been fed to the classifiers both
individually and simultaneously. The recognition rate achieved by the proposed
system is 94.29%.
Parveen Kumar, Nitin Sharma and Arun Rana [68] have proposed a handwritten
character recognition system using SVM. The recognition model is divided into
two phases, namely training and testing. In the training phase, 25 features are
extracted from each character and used to train the SVM. In the testing phase,
the SVM classifier is used to recognize the characters. The recognition rate
using a linear kernel function is 94.8%.
Anshuman Sharma [57] has proposed handwritten digit recognition using a
Support Vector Machine. In this work, the SVM (a binary classifier) is
applied to the multi-class numeral recognition problem by using a
one-versus-rest method. The SVM is trained on the training samples using a
linear kernel.
Pritpal Singh and Sumit Budhiraja [19] have proposed an SVM algorithm to
recognise handwritten Gurumukhi script. They use the Radial Basis Function
(Gaussian) kernel and the polynomial kernel, achieving 73.02% and 95.04%
recognition rates respectively.
ZHAO Bin et al. [69] have used a Support Vector Machine for classification
and applied it in a Chinese check recognition system. Experiments on the
NIST numeral database and on actual check samples show that, in comparison
with other classifiers, the SVM possesses better generalization ability. They
obtained 99.95% accuracy [70].
Gita Sinha et al. [70] have proposed an SVM for handwritten Gurumukhi
character recognition. The proposed system uses a Radial Basis Function kernel
and achieves a 95.11% recognition rate.
N. Shanthi and K. Duraiswamy [71] have proposed a recognition system for
offline unconstrained handwritten Tamil characters based on a support vector
machine. Because of the great variation among handwritten characters, the
system is trained with 106 characters and tested on 34 selected Tamil
characters. The characters are chosen such that the sample data set
represents almost all the characters. Pixel densities are calculated for
different zones of the image, and these values are used as the features of a
character. These features are used to train and test the support vector
machine [72].
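Zone-based pixel density features can be sketched as follows. The sketch assumes a binary image whose dimensions divide evenly into the zone grid; the 2 × 2 grid and the 4 × 4 sample image are illustrative, not the settings used in [71].

```python
def zone_densities(image, zone_rows=2, zone_cols=2):
    """Split a binary image into a grid of zones and return, for each zone,
    the fraction of foreground (1) pixels, in row-major zone order."""
    rows, cols = len(image), len(image[0])
    zone_h, zone_w = rows // zone_rows, cols // zone_cols
    features = []
    for zr in range(zone_rows):
        for zc in range(zone_cols):
            pixels = [
                image[r][c]
                for r in range(zr * zone_h, (zr + 1) * zone_h)
                for c in range(zc * zone_w, (zc + 1) * zone_w)
            ]
            features.append(sum(pixels) / len(pixels))
    return features

image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zone_densities(image)  # [1.0, 0.0, 0.0, 0.75]
```

The resulting vector has one value per zone and would serve as the input to the classifier in place of the raw pixels.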
In [73], the authors propose a Support Vector Machine (SVM) based scheme
for the recognition of Gujarati handwritten numerals. A technique
based on affine invariant moments is applied for feature extraction, and the
recognition rate is approximately 91% [74].
2.2.6 Decision Tree Classifier
The concept of a decision tree is the decomposition of a complex problem into
smaller, more manageable sub-problems; it represents the relationship among
attributes and decisions in a diagram that resembles a tree [75] [76]. The
classification is produced by an algorithm that identifies various ways of
splitting the data into branch-like tree segments. The segmentation basically
comprises 3 structures: an internal node denotes a test on an attribute, a
branch represents an
outcome of the test, and leaf nodes represent class distributions. Decision
tree induction has several algorithms, such as ID3, C4.5 (an extension of ID3),
and CART. Generally, tree induction and tree pruning are the two main processes
in a decision tree classifier. Tree induction is an iterative training process
which involves splitting the data on attributes into smaller subsets. This
process starts by analyzing the whole dataset to find the condition attribute
that, when selected as a splitting rule, results in nodes that are most
different from each other with respect to the target class. Subsequently, the
tree is generalized in the pruning process by removing the least reliable
branches, which improves accuracy [75].
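One way to make "nodes that are most different with respect to the target class" concrete is the information gain criterion used by ID3. The sketch below picks the splitting attribute with the highest gain; the attribute names and the four toy records are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(records, labels, attribute):
    """Reduction in entropy obtained by splitting the records on one attribute."""
    subsets = {}
    for record, label in zip(records, labels):
        subsets.setdefault(record[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def best_split(records, labels):
    """Attribute whose split gives the purest child nodes."""
    return max(records[0], key=lambda a: information_gain(records, labels, a))

# Toy character descriptions (hypothetical attributes and classes).
records = [
    {"has_loop": True, "tall": True},
    {"has_loop": True, "tall": False},
    {"has_loop": False, "tall": True},
    {"has_loop": False, "tall": False},
]
labels = ["O", "O", "I", "I"]

gain_loop = information_gain(records, labels, "has_loop")  # 1.0 bit: perfect split
gain_tall = information_gain(records, labels, "tall")      # 0.0 bits: no help
chosen = best_split(records, labels)                       # "has_loop"
```

Tree induction applies this selection recursively to each resulting subset; C4.5 and CART use related but different criteria (gain ratio and Gini impurity, respectively).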
A decision tree is a tree in which each branch node represents a choice
between a number of alternatives, and each leaf node represents a
classification or decision [75]. In the decision tree, the root and internal
nodes contain attribute test conditions to separate records that have different
characteristics.
Figure 2.13 Decision Tree
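Classifying a record with such a tree amounts to following attribute tests from the root down to a leaf. The sketch below represents internal nodes as dictionaries and leaves as class labels; this representation, the attribute names and the tree itself are assumptions made for illustration.

```python
def classify(node, record):
    """Walk from the root to a leaf, following the branch chosen by each test."""
    while isinstance(node, dict):          # internal node: apply its attribute test
        outcome = record[node["attribute"]]
        node = node["branches"][outcome]   # branch: one outcome of the test
    return node                            # leaf: the class label

# Hypothetical tree: root tests "has_loop", one internal node tests "has_dot".
tree = {
    "attribute": "has_loop",
    "branches": {
        True: "O",
        False: {
            "attribute": "has_dot",
            "branches": {True: "i", False: "l"},
        },
    },
}

result_1 = classify(tree, {"has_loop": True, "has_dot": False})   # "O"
result_2 = classify(tree, {"has_loop": False, "has_dot": True})   # "i"
result_3 = classify(tree, {"has_loop": False, "has_dot": False})  # "l"
```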