Text Extraction and a Deep CNN Based Model for Character Classification in Kannada Documents
Sachin Bhat1, Seshikala G2
1Sachin Bhat, School of ECE, Reva University, Bengaluru/ SMVITM, Udupi, India.
2G Seshikala, School of ECE, Reva University, Bangalore, India.
Manuscript received on 02 June 2019 | Revised Manuscript received on 10 June 2019 | Manuscript published on 30 June 2019 | PP: 2957-2962 | Volume-8 Issue-8, June 2019 | Retrieval Number: H7418068819/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Pattern analysis in documents is an active research topic because of its wide range of applications. It has shown its potential in reducing the manual work of converting documents containing handwritten characters into machine-readable text. Deep convolutional neural networks (DCNNs) have been successfully applied to character recognition in various languages. However, high noise, degradation over long periods of time, and low contrast and intensity make it difficult to separate the foreground text, which hampers the extraction of characters from document images. This paper covers both aspects: the preprocessing of Kannada documents and a DCNN-based architecture for the classification of Kannada characters. Kannada is one of the 22 official languages of India and is spoken by more than 60 million people across the globe. The model is developed mainly to assist character recognition in Kannada documents. The dataset contains a total of 84,000 characters, including both vowels and consonants. The architecture achieves a test accuracy of 98.87% for the classification of 42 handwritten characters.
Keywords: CNN, Document Analysis, Image Enhancement, Optical Character Recognition.
Scope of the Article: Classification.
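
Illustrative note: the abstract identifies separating foreground text from a noisy, low-contrast background as the main obstacle in character extraction, but it does not name the preprocessing method used. As a minimal sketch only, the following shows one common global binarization technique, Otsu's thresholding, which is an assumption here and not necessarily the authors' approach. The function names `otsu_threshold` and `binarize`, and the sample pixel values, are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    `pixels` is a flat list of integer gray levels in 0..255.
    Pixels <= t form the dark class, pixels > t form the light class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels in the dark class so far
    sum_dark = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue            # dark class still empty
        w_light = total - w_dark
        if w_light == 0:
            break               # light class empty from here on
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Mark dark (ink) pixels as 1 and light (paper) pixels as 0.

    Assumes dark text on a lighter background, which is an assumption
    about the documents and not a detail taken from the paper.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]


# Hypothetical 3x3 grayscale patch: three dark ink pixels on light paper.
sample = [
    [200, 210, 200],
    [40, 50, 210],
    [200, 40, 205],
]
mask = binarize(sample)
```

A single global threshold of this kind tends to struggle on unevenly degraded pages, so locally adaptive thresholding is often preferred for historical documents; the abstract does not say which family of methods the paper uses.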