H10320688S319 - International Journal of Innovative Technology and Exploring Engineering (IJITEE)

Preprocessing for Parts of Speech (POS) Tagging in Dogri Language
Shivangi Dutta¹, Bhavna Arora²

¹Shivangi Dutta, Department of Computer Science & IT, Central University of Jammu, Jammu, India.

²Bhavna Arora, Department of Computer Science & IT, Central University of Jammu, Jammu, India.

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Natural language processing (NLP) is regarded as one of the most important fields spanning computer science, information retrieval and artificial intelligence. One challenging task in NLP is parts of speech (POS) tagging, the process of labelling each word in a corpus with its part of speech. English grammar recognizes eight major parts of speech: noun, pronoun, verb, adjective, adverb, preposition, conjunction and interjection. Over the past few years, researchers have carried out a considerable amount of work using various approaches to supervised and unsupervised tagging. These tagging methods are further divided into rule-based, stochastic and hybrid approaches. The language chosen for this research is Dogri, which is written in the Devanagari script. The paper presents related work on languages that share the same script as Dogri. This survey helps in selecting an appropriate technique for POS tagging of Dogri. The paper also presents a grammatical and inflectional analysis of Dogri, along with a few rules for designing a POS tagger. A section of the paper also demonstrates the results of preprocessing, i.e. tokenization and stemming of Dogri text, which are considered the initial steps in POS tagging.

Keywords: Dogri language, Parts of speech tagging, stemming, tokenization.
Scope of the Article: Natural Language Processing
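
The two preprocessing steps named in the abstract, tokenization and stemming, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `tokenize` and `stem`, the three-entry suffix list, and the sample sentence are assumptions chosen only to show the mechanics on Devanagari text; they are not Dogri rules taken from the paper.

```python
import re

# Danda (U+0964) and double danda (U+0965) end sentences in Devanagari text.
# They and common ASCII punctuation are emitted as separate tokens.
PUNCTUATION = "\u0964\u0965,.;:!?()\"'-"
_P = re.escape(PUNCTUATION)

# A token is either a run of non-space, non-punctuation characters
# or a single punctuation mark.
TOKEN_RE = re.compile(rf"[^\s{_P}]+|[{_P}]")

# Hypothetical suffix list, for illustration only (NOT from the paper).
SUFFIXES = (
    "\u093f\u092f\u093e\u0902",  # i-matra + ya + aa-matra + anusvara
    "\u0947\u0902",              # e-matra + anusvara
    "\u094b\u0902",              # o-matra + anusvara
)


def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return TOKEN_RE.findall(text)


def stem(token, suffixes=SUFFIXES, min_stem_len=2):
    """Strip the longest matching suffix, keeping at least min_stem_len code points."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem_len:
            return token[: -len(suffix)]
    return token


if __name__ == "__main__":
    # Placeholder Devanagari sentence, not an example from the paper.
    sample = "किताबें मेज पर हैं।"
    tokens = tokenize(sample)
    print(tokens)
    print([stem(t) for t in tokens])
```

Trying longer suffixes first prevents a short suffix from matching where a longer one applies, and the minimum stem length guards against reducing short words to nothing. A stemmer for Dogri would need a suffix inventory derived from the language's inflectional analysis.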
