An Improved Dragonfly Optimization Algorithm based Feature Selection in High Dimensional Gene Expression Analysis for Lung Cancer Recognition
F. Leena vinmalar1, A. Kumar Kombaiya2
1F. Leena vinmalar*, Research Scholar, Department of Computer Science, Chikkanna Government Arts College-Tirupur, India.
2Dr. A. Kumar Kombaiya, Assistant Professor, Department of Computer Science, Chikkanna Government Arts College-Tirupur, India.
Manuscript received on May 16, 2020. | Revised Manuscript received on June 05, 2020. | Manuscript published on June 10, 2020. | PP: 896-908 | Volume-9 Issue-8, June 2020. | Retrieval Number: H6302069820/2020©BEIESP | DOI: 10.35940/ijitee.H6302.069820
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Microarray gene expression data make it possible to analyze the expression of thousands of genes and their relation to disease. Comparing the gene expression of diseased tissue with that of normal tissue yields a more accurate analysis and helps to recognize the type of cancer. Processing microarray datasets (feature selection, sampling and classification) is highly challenging because of their high dimensionality, and many recent studies have used feature selection techniques for dimensionality reduction. The Dragonfly optimization Algorithm (DA) is one such feature selection technique and has been used to reduce the dimensionality of lung cancer gene expression datasets. In DA, the dragonflies fly randomly according to a model based on the Levy Flight Mechanism (LFM). Because of its very large search steps, LFM has drawbacks such as interruption of the random flights and overflow of the search area. In addition, DA lacks an internal memory that records past candidate solutions, which can lead to premature convergence to local optima. This paper therefore introduces an Improved Dragonfly optimization Algorithm (IDA) that effectively reduces the dimensionality of the lung cancer gene expression dataset. In IDA, the Brownian motion method is used to overcome the issues of LFM, and the pbest and gbest concepts of Particle Swarm Optimization (PSO) guide the search toward promising candidate solutions, further refining the search space and avoiding premature convergence. IDA follows the wrapper feature selection approach to select an optimal subset of features. Random Subspace (RS), Artificial Neural Network (ANN) and Sequential Minimal Optimization (SMO) classifiers are used both to evaluate the feature subsets selected by IDA and to recognize lung cancer subtypes. In each iteration, the fitness value of a dragonfly is the accuracy of the classifier on the training instances, using only the features that the dragonfly selected. Finally, the experimental results demonstrate the effectiveness of IDA in terms of accuracy, precision, recall and F-measure.
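The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (levy_step, brownian_step, accuracy_fitness, ida_sketch) is hypothetical. It assumes that Gaussian (Brownian) steps are used in place of Levy steps, that a position value above 0.5 means "feature selected", and that the pbest/gbest pull takes a simple PSO-like form; the abstract does not give the actual update equations. A small nearest-centroid classifier stands in for RS, ANN and SMO so that the example needs only the standard library.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """Heavy-tailed step (Mantegna's method), shown only for contrast:
    most steps are small, but some are very large and can leave the search area."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def brownian_step(rng, scale=0.1):
    """Gaussian step: light tails, so very large jumps are extremely unlikely.
    The scale value is an assumption, not taken from the paper."""
    return rng.gauss(0, scale)

def accuracy_fitness(mask, X, y):
    """Wrapper fitness: training accuracy of a classifier restricted to the
    selected features. Nearest-centroid is a stand-in for RS/ANN/SMO."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(cols, centroids[c])))
        correct += pred == t
    return correct / len(y)

def ida_sketch(X, y, n_agents=8, n_iter=30, seed=0):
    """Search for a feature mask using Brownian steps plus pbest/gbest memory."""
    rng = random.Random(seed)
    d = len(X[0])

    def to_mask(p):
        return [v > 0.5 for v in p]  # assumed binarization rule

    pos = [[rng.random() for _ in range(d)] for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    pbest_fit = [accuracy_fitness(to_mask(p), X, y) for p in pos]
    g = max(range(n_agents), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_agents):
            for j in range(d):
                # Assumed PSO-like pull toward the personal and global bests.
                pull = (rng.random() * (pbest[i][j] - pos[i][j])
                        + rng.random() * (gbest[j] - pos[i][j]))
                moved = pos[i][j] + pull + brownian_step(rng)
                pos[i][j] = min(1.0, max(0.0, moved))  # stay inside [0, 1]
            fit = accuracy_fitness(to_mask(pos[i]), X, y)
            if fit > pbest_fit[i]:  # remember this dragonfly's best position
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:  # remember the swarm's best position
                    gbest, gbest_fit = pos[i][:], fit
    return to_mask(gbest), gbest_fit
```

Calling ida_sketch(X, y) on a list of samples X (each a list of feature values) and a list of class labels y returns a Boolean feature mask together with its training accuracy. Keeping pbest and gbest means a good subset found early is never lost, which is the role the abstract assigns to the PSO memory; a real evaluation would also need held-out data rather than training accuracy alone.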
Keywords: Lung cancer recognition; gene expression data; Dragonfly optimization Algorithm; Improved Dragonfly optimization Algorithm; Brownian motion method.
Scope of the Article: Discrete Optimization