Word and Character Segmentation in Devanagari and Odia Script – A Comparative Analysis
Ipsita Pattnaik1, Tushar Patnaik2
1Ipsita Pattnaik*, M Tech Computer Science, C-DAC, Noida, India.
2Tushar Patnaik, Research & Development, C-DAC, Noida, India.
Manuscript received on June 19, 2020. | Revised Manuscript received on June 29, 2020. | Manuscript published on July 10, 2020. | PP: 377-382 | Volume-9 Issue-9, July 2020 | Retrieval Number: 100.1/ijitee.I7060079920 | DOI: 10.35940/ijitee.I7060.079920
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Optical Character Recognition (OCR) has been an active research area in computer science for several years, and several research works have been undertaken on various languages of India. This paper attempts to measure the accuracy of word and character segmentation for Hindi, an official language of India written in the Devanagari script, and Odia, a regional language spoken mostly in Odisha and a few other Eastern Indian states. A comparative analysis of the two is presented. Ten sets each of printed Odia and Devanagari documents, with different word counts, were used in this study. The documents were scanned at 300 dpi before the pre-processing and segmentation procedures were applied. The results show that the accuracy of both word and character segmentation is higher for Odia than for Hindi. One reason is the header line (shirorekha) used in Hindi, which makes the segmentation process cumbersome. Thus, it can be concluded that accuracy can vary from one language to another, and between word segmentation and character segmentation.
Keywords: Shirorekha, Pre-processing, Segmentation, Devanagari and Odia Scripts.
Scope of the Article: Digital Signal Processing Theory
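The abstract attributes the lower Hindi accuracy partly to the header line. The abstract does not specify the segmentation algorithm, so the following is only a minimal sketch under the assumption that characters are separated by a vertical projection profile, which is a common approach. The function names (`column_runs`, `strip_header_line`) and the toy 6x8 image are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def column_runs(binary):
    """Return (start, end) column spans that contain ink; end is exclusive."""
    has_ink = binary.sum(axis=0) > 0  # vertical projection profile
    runs, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(has_ink)))
    return runs

def strip_header_line(binary):
    """Blank the densest row: a crude stand-in for shirorekha removal (assumption)."""
    out = binary.copy()
    out[np.argmax(binary.sum(axis=1))] = 0
    return out

# Toy word: three 2-pixel-wide glyphs separated by 1-pixel gaps (1 = ink).
word = np.zeros((6, 8), dtype=np.uint8)
for x0 in (0, 3, 6):
    word[1:6, x0:x0 + 2] = 1

odia_like = word.copy()        # no header line
devanagari_like = word.copy()
devanagari_like[0, :] = 1      # header line joins all glyphs

print(column_runs(odia_like))                           # gaps are visible
print(column_runs(devanagari_like))                     # header hides the gaps
print(column_runs(strip_header_line(devanagari_like)))  # gaps reappear
```

In this toy case the header row fills every column of the profile, so the word appears as a single connected span until that row is removed. Real scanned text would additionally need binarization, skew correction, and a more robust header-line detector.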