Correlation based Ensemble Feature Selection Algorithm for Diagnosis of Diabetics
R. Kuppuchamy1, T. Kamalavalli2, S. Vinothini3, N. Jayalakshmi4, N. Vallileka5
1R. Kuppuchamy, Department of MCA, PSNA College of Engineering and Technology, Dindigul (Tamil Nadu), India.
2T. Kamalavalli, Department of MCA, PSNA College of Engineering and Technology, Dindigul (Tamil Nadu), India.
3S. Vinothini, Department of MCA, PSNA College of Engineering and Technology, Dindigul (Tamil Nadu), India.
4N. Jayalakshmi, Department of MCA, PSNA College of Engineering and Technology, Dindigul (Tamil Nadu), India.
5N. Vallileka, Department of MCA, PSNA College of Engineering and Technology, Dindigul (Tamil Nadu), India.
Manuscript received on 11 January 2020 | Revised Manuscript received on 07 February 2020 | Manuscript Published on 20 February 2020 | PP: 373-377 | Volume-9 Issue-3S January 2020 | Retrieval Number: C10800193S20/2020©BEIESP | DOI: 10.35940/ijitee.C1080.0193S20
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Data cleaning and feature selection are key preprocessing steps in data mining. Feature selection reduces dimensionality, eliminates irrelevant data, improves accuracy, and makes the output easier to interpret. This paper uses a wrapper/hybrid-filter-based feature selection method to select and extract features from a medical dataset. Each extracted feature is then evaluated by computing a rank value, which helps select highly correlated features from the entire dataset. The selected features are classified using the popular C4.5 classifier. To evaluate the proposed method, a benchmark dataset is obtained from the UCI repository, a well-known machine learning repository that several earlier studies have used to evaluate their proposed methods. Finally, the classification accuracy shows that the proposed method outperforms existing methods.
Keywords: C4.5, Correlation, Information Gain, Feature Selection, Classification.
Scope of the Article: Algorithm Engineering
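Illustrative sketch: the abstract describes ranking individual features by a correlation-based score and keeping the highest-ranked ones before classification. The snippet below is one minimal way such a ranking step could look, not the authors' algorithm. It assumes the rank value is the absolute Pearson correlation between a feature and the class label. The function names (`pearson`, `rank_features`, `select_top_k`), the feature names, and the toy data are all hypothetical, and the C4.5 classification step is omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relation to the label
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, labels, names):
    """Score each feature by |correlation with the label|, highest first.

    Assumption: the paper's "rank value" is approximated here by absolute
    Pearson correlation; the actual ranking formula may differ.
    """
    columns = list(zip(*rows))  # transpose rows into per-feature columns
    scores = {name: abs(pearson(col, labels)) for name, col in zip(names, columns)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_top_k(ranked, k):
    """Keep the names of the k highest-ranked features."""
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data (not taken from the paper or from the UCI repository).
names = ["glucose", "bmi", "noise"]
rows = [
    (85, 22.0, 5),
    (90, 24.5, 1),
    (150, 31.0, 4),
    (165, 33.5, 2),
    (95, 30.0, 3),
    (170, 26.0, 3),
]
labels = [0, 0, 1, 1, 0, 1]  # 1 = positive diagnosis, 0 = negative

ranked = rank_features(rows, labels, names)
selected = select_top_k(ranked, 2)

for name, score in ranked:
    print(f"{name}: {score:.3f}")
print("selected:", selected)
```

In this toy example the uninformative `noise` column receives the lowest score and is dropped. The reduced feature set would then be passed to a decision-tree learner such as C4.5, which is outside the scope of this sketch.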