Feature Selection Methods for Mining Social Media
SV Mageshwari¹, I. Laurence Aroquiaraj²
¹V Mageshwari*, Department of Computer Science, Periyar University, Salem, India.
²Dr. I. Laurence Aroquiaraj, Department of Computer Science, Periyar University, Salem, India.
Manuscript received on October 13, 2019. | Revised manuscript received on October 23, 2019. | Manuscript published on November 10, 2019. | PP: 4060-4064 | Volume-9 Issue-1, November 2019. | Retrieval Number: A6120119119/2019©BEIESP | DOI: 10.35940/ijitee.A6120.119119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Social media lets people share thoughts and opinions that can spread easily and widely. Many public issues and political views are discussed on social media, and HIV/AIDS is one of the important topics. This work aims to classify HIV/AIDS-related Twitter data. Because Twitter data is high-dimensional, reducing its dimensionality is essential for better classification results. Tweets are collected using keyword search, and the necessary preprocessing steps are carried out. Feature extraction methods, namely the Bag of Words (BOW) model and TF-IDF, are then implemented. Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) are used for dimensionality reduction. Finally, classification is carried out and the results are discussed.
Keywords: Tweets, Pre-processing, BOW, TF-IDF, SVD, PCA, Classification.
Scope of the Article: Classification
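
The feature extraction step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the function names (`tokenize`, `bag_of_words`, `tf_idf`), the example tweets, and the particular TF-IDF weighting (term frequency times log(N / document frequency)) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: j for j, tok in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, toks in enumerate(tokenized):
        for tok, n in Counter(toks).items():
            matrix[i][index[tok]] = n
    return vocab, matrix


def tf_idf(counts):
    """Weight raw counts by tf * log(N / df), one common TF-IDF variant."""
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in counts:
        total = sum(row)
        weighted.append([
            (row[j] / total) * math.log(n_docs / df[j]) if total else 0.0
            for j in range(n_terms)
        ])
    return weighted


# Hypothetical example tweets, not data from the paper.
tweets = [
    "hiv testing is free today",
    "free hiv testing saves lives",
    "election results are out today",
]
vocab, counts = bag_of_words(tweets)
weights = tf_idf(counts)
```

Each tweet becomes one row with one column per vocabulary term, which is why the representation grows high-dimensional on real data. A reduction step such as SVD or PCA would then be applied to a matrix like `weights` before classification.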