Feature Selection for Machine Learning in Big Data
K. Kalpana1, G. Sunil Vijaya Kumar2, K. Madhavi3
1K. Kalpana, Research Scholar, Department of Computer Science and Engineering, Jawaharlal Nehru Technological University Anantapur, Andhra Pradesh, India.
2Dr. G. Sunil Vijaya Kumar, Professor, Department of Computer Science and Engineering, G. Pulla Reddy Engineering College, Kurnool, Andhra Pradesh, India.
3Dr. K. Madhavi, Associate Professor, Jawaharlal Nehru Technological University College of Engineering, Ananthapuramu, Andhra Pradesh, India.
Manuscript received on 08 April 2019 | Revised Manuscript received on 15 April 2019 | Manuscript Published on 26 July 2019 | PP: 332-335 | Volume-8 Issue-6S4 April 2019 | Retrieval Number: F10670486S419/19©BEIESP | DOI: 10.35940/ijitee.F1067.0486S419
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: We are in the information age, collecting huge volumes of data from diverse sources in structured, unstructured, and semi-structured forms, ranging from petabytes to exabytes. Data is an asset because valuable knowledge and information are hidden in such massive volumes. Data analytics is required to gain deeper insight and identify fine-grained patterns so that accurate predictions can be made and decision making improved. Data analytics extracts knowledge from data, and machine learning forms its core. The growth in the dimensionality of data, in both the number of tuples and the number of features, poses several challenges for machine learning algorithms. Data is preprocessed before machine learning, and feature selection is performed as a preprocessing step to reduce the dimensionality of the data, thereby removing irrelevant features and improving the efficiency and accuracy of the machine learning algorithm. In this paper, we study various feature selection mechanisms and analyze whether they can be adopted for sentiment analysis of big data.
Keywords: Big Data, Machine Learning, Dimensionality Reduction, Feature Selection.
Scope of the Article: Machine Learning
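To illustrate feature selection as a preprocessing step for sentiment analysis, the following is a minimal sketch of a filter-style selector. It is not taken from the paper: the choice of the chi-square statistic, the function names `chi_square_scores` and `select_top_k`, and the four toy documents are all assumptions made for illustration. Each term is scored by how strongly its presence is associated with the sentiment label, and only the top-scoring terms are kept.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term with the chi-square statistic of its 2x2
    term-presence / class contingency table (labels are 0 or 1)."""
    n = len(docs)
    pos_total = sum(1 for y in labels if y == 1)
    neg_total = n - pos_total

    # Document frequency of each term within each class.
    term_pos, term_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            if y == 1:
                term_pos[term] += 1
            else:
                term_neg[term] += 1

    scores = {}
    for term in set(term_pos) | set(term_neg):
        a = term_pos[term]   # positive documents containing the term
        b = term_neg[term]   # negative documents containing the term
        c = pos_total - a    # positive documents without the term
        d = neg_total - b    # negative documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(docs, labels, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: 1 = positive sentiment, 0 = negative sentiment.
docs = [
    "the movie was great".split(),
    "great acting and great plot".split(),
    "the movie was awful".split(),
    "awful plot and awful acting".split(),
]
labels = [1, 1, 0, 0]

selected = select_top_k(docs, labels, k=2)
print(selected)
```

In this toy corpus, terms that occur equally often in both classes (such as "the" or "plot") score zero and are discarded, while the sentiment-bearing terms are retained. At big-data scale the same per-term counts would have to be gathered in a distributed fashion, which this sketch does not attempt.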