Classification of Cancer Gene Subtypes from Clustering of Gene Expression Data
Logenthiran Machap1, Afnizanfaizal Abdullah2, Zuraini Ali Shah3

1Logenthiran Machap, School of Computing, Faculty of Engineering, University Technology Malaysia, Johor, Malaysia.

2Afnizanfaizal Abdullah, School of Computing, Faculty of Engineering, University Technology Malaysia, Johor, Malaysia.

3Zuraini Ali Shah, School of Computing, Faculty of Engineering, University Technology Malaysia, Johor, Malaysia.

Manuscript received on 04 May 2019 | Revised Manuscript received on 09 May 2019 | Manuscript Published on 13 May 2019 | PP: 332-336 | Volume-8 Issue-7S May 2019 | Retrieval Number: G10590587S19/19©BEIESP

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Microarray gene expression data typically conceal important information needed to understand the molecular biological processes that occur in a specific organism in response to its environment. Uncovering the hidden patterns in gene expression data can substantially improve the interpretation of functional genomics. The intricacy of biological networks and the large number of genes make such high-dimensional data, which contain many measurements, difficult to understand. Clustering techniques, which are crucial in data mining, are therefore used as a first step to address this challenge by discovering logical structures and identifying significant patterns hidden in the data. These patterns may offer evidence about the biological processes related to different physiological conditions. In particular, this paper focuses on a co-clustering algorithm that clusters genes and conditions simultaneously to obtain co-clusters, which are then used for classification. The method is called improved network-assisted co-clustering for the identification of cancer subtypes (iNCIS). Fundamentally, it integrates gene network information with gene expression data to obtain biologically significant clusters. The classes obtained from the clusters were used in the classification of genes to improve accuracy. The method was applied to breast cancer and glioblastoma multiforme datasets. The discovered structures revealed biologically significant associations between the functional annotations of genes and the related conditions.

Keywords: Classification, Clustering, Gene Expression, Microarray.
Scope of the Article: Classification
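
The two-stage idea described in the abstract (co-cluster genes and conditions, then reuse the resulting cluster labels as classes for a classifier) can be illustrated with a minimal sketch. This is not the iNCIS method: it omits the gene network information entirely, uses a simple alternating block co-clustering in place of the authors' algorithm, and substitutes a nearest-centroid classifier. The toy expression matrix and all function names (`seed_labels`, `block_means`, `cocluster`, `fit_centroids`, `predict`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy co-clustering followed by classification.
# Assumption: a plain block co-clustering stands in for iNCIS, and no gene
# network information is used. All data below are made up.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_labels(vectors, k):
    """Farthest-first seeding, then assign each vector to its nearest seed."""
    seeds = [vectors[0]]
    while len(seeds) < k:
        seeds.append(max(vectors, key=lambda v: min(sq_dist(v, s) for s in seeds)))
    return [min(range(k), key=lambda c: sq_dist(v, seeds[c])) for v in vectors]

def block_means(matrix, row_lab, col_lab, k_rows, k_cols):
    """Mean expression inside every (gene cluster, condition cluster) block."""
    sums = [[0.0] * k_cols for _ in range(k_rows)]
    counts = [[0] * k_cols for _ in range(k_rows)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            sums[row_lab[i]][col_lab[j]] += value
            counts[row_lab[i]][col_lab[j]] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(k_cols)] for r in range(k_rows)]

def cocluster(matrix, k_rows, k_cols, n_iter=10):
    """Alternately reassign genes (rows) and conditions (columns) to blocks."""
    columns = [list(col) for col in zip(*matrix)]
    row_lab = seed_labels(matrix, k_rows)
    col_lab = seed_labels(columns, k_cols)
    for _ in range(n_iter):
        means = block_means(matrix, row_lab, col_lab, k_rows, k_cols)
        # Move each gene to the row cluster whose block means fit it best.
        row_lab = [min(range(k_rows),
                       key=lambda r: sum((v - means[r][col_lab[j]]) ** 2
                                         for j, v in enumerate(row)))
                   for row in matrix]
        means = block_means(matrix, row_lab, col_lab, k_rows, k_cols)
        # Move each condition to the column cluster that fits it best.
        col_lab = [min(range(k_cols),
                       key=lambda c: sum((v - means[row_lab[i]][c]) ** 2
                                         for i, v in enumerate(col)))
                   for col in columns]
    return row_lab, col_lab

def fit_centroids(samples, labels, k):
    """Mean profile per class; assumes every class has at least one sample."""
    centroids = []
    for c in range(k):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        centroids.append([sum(vals) / len(members) for vals in zip(*members)])
    return centroids

def predict(centroids, sample):
    """Nearest-centroid classification of one sample profile."""
    return min(range(len(centroids)), key=lambda c: sq_dist(sample, centroids[c]))

# Toy matrix: rows are genes, columns are samples (conditions).
# Genes 0 and 2 are high in samples 0, 2, 4; genes 1 and 3 in samples 1, 3, 5.
expression = [
    [9, 1, 8, 0, 9, 1],
    [1, 9, 0, 8, 1, 9],
    [8, 0, 9, 1, 8, 1],
    [0, 8, 1, 9, 1, 8],
]

gene_labels, sample_labels = cocluster(expression, k_rows=2, k_cols=2)

# Stage two: the co-cluster labels serve as classes for a simple classifier.
sample_profiles = [list(col) for col in zip(*expression)]
centroids = fit_centroids(sample_profiles, sample_labels, k=2)
new_sample_class = predict(centroids, [8, 1, 9, 0])

print("gene clusters:  ", gene_labels)
print("sample clusters:", sample_labels)
print("class of new sample:", new_sample_class)
```

In this sketch the classifier is trained on sample (condition) profiles because the toy data make that easy to follow; the same `fit_centroids` and `predict` helpers could equally be applied to gene profiles with `gene_labels` as the classes, which is closer to the classification of genes mentioned in the abstract.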