A Combined Filter Wrapper Classification Method for Gene Selection from Gene Expression Datasets
Suchishree Panda1, Kaberi Das2, Debahuti Mishra3, Ashwini Kumar Pradhan4
1Suchishree Panda, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India.
2Kaberi Das, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India.
3Debahuti Mishra, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India.
4Ashwini Kumar Pradhan, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India.
Manuscript received on 21 August 2019. | Revised Manuscript received on 07 September 2019. | Manuscript published on 30 September 2019. | PP: 1968-1975 | Volume-8 Issue-11, September 2019. | Retrieval Number: K21470981119/2019©BEIESP | DOI: 10.35940/ijitee.K2147.0981119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Gene selection methods are applied to a large pool of genes to identify those that are indicative of particular conditions, such as diseases and their classes. The rapid growth of DNA microarray data and its influence on science have enabled major advances in fields such as Ecology, Bioinformatics, and Computer Science. DNA microarray research has also opened up scope for new gene selection methods aimed at identifying informative genes for classification. Classifying gene expression data remains a challenge for data miners because the datasets are very large and heterogeneous. This work presents a hybrid approach that pairs each filter method (Information Gain, Pearson Correlation Coefficient, Relief-F) with each wrapper method (Genetic Algorithm, Forward Selection Backward Elimination, Particle Swarm Optimization), covering all combinations. The selected genes are classified with a Support Vector Machine (SVM), and the resulting classification accuracy is optimized and validated. The filter, wrapper, and hybrid methods are compared by applying them to three cancer microarray datasets.
Keywords: DNA microarray, Gene selection, High dimensionality, Gene expression, SVM
Scope of the Article: Classification
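
The hybrid filter-wrapper idea described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes one specific pairing: a Pearson Correlation Coefficient filter followed by a greedy forward-selection wrapper. To keep the example free of third-party dependencies, a nearest-centroid classifier stands in for the SVM, and the data are synthetic. All function names (`pearson`, `filter_top_k`, `loo_accuracy`, `forward_selection`, `make_toy_data`) and the parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def filter_top_k(X, y, k):
    """Filter stage: rank genes by |r| with the class label and keep the top k."""
    n_genes = len(X[0])
    scores = [abs(pearson([row[g] for row in X], y)) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`.

    Assumption: this simple classifier replaces the SVM used in the paper.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            if rows:  # skip a class that has no remaining training samples
                centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
        # Predict the class whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda c: sum((X[i][g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def forward_selection(X, y, candidates):
    """Wrapper stage: greedily add the gene that most improves accuracy."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_gene = None
        for g in candidates:
            if g in selected:
                continue
            acc = loo_accuracy(X, y, selected + [g])
            if acc > best:
                best, best_gene, improved = acc, g, True
        if improved:
            selected.append(best_gene)
    return selected, best


def make_toy_data(n_samples=20, n_genes=30, seed=0):
    """Synthetic expression matrix: genes 0 and 1 carry the class signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = []
    for label in y:
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        row[0] += 4 * label  # shifted up in class 1
        row[1] -= 4 * label  # shifted down in class 1
        X.append(row)
    return X, y


X, y = make_toy_data()
candidates = filter_top_k(X, y, k=5)                      # filter: 30 genes -> 5
selected, acc = forward_selection(X, y, candidates)       # wrapper on the survivors
print("filtered genes:", candidates)
print("selected genes:", selected, "LOO accuracy:", acc)
```

The filter cheaply reduces the search space so that the more expensive wrapper, which retrains the classifier for every candidate subset, only evaluates a handful of genes. Swapping in a different filter score or wrapper search would follow the same two-stage shape.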