Text Graph: An Enhanced Graph Fusion Model for Document Clustering
M. Uma Maheswari¹, J. G. R. Sathiaseelan²
¹ M. Uma Maheshwari, Research Scholar, Department of Computer Science, Bishop Heber College, Tiruchirappalli, Tamil Nadu, India.
² Dr. J. G. R. Sathiaseelan, Associate Professor & Head, Department of Computer Science, Bishop Heber College, Tiruchirappalli, Tamil Nadu, India.
Manuscript received on 17 May 2019 | Revised Manuscript received on 24 May 2019 | Manuscript Published on 02 June 2019 | PP: 640-644 | Volume-8 Issue-7S2 May 2019 | Retrieval Number: G11090587S219/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Text clustering is a well-known method for improving the quality of information retrieval: it groups a large number of unordered text documents into subgroups of related documents. Extracting compact and meaningful insights from large collections of unstructured text documents remains a current challenge. Various clustering techniques are used to make the clusters in a document collection accessible. This paper introduces a new document clustering technique based on a graph model. The document collection is represented as a graph in which each node represents a document and each edge represents the similarity between two documents. The paper proposes a Text Graph algorithm based on this graph structure. Unstructured documents contain a vast number of features, which must be reduced before graph construction. Count-based and semantic-based feature reduction methods are used to select the most important features. Based on these features, the algorithm constructs the text graph structures. The two text graph structures (word count and semantic) are then combined to generate a fusion graph model. In the fusion model, each document is connected to its k-nearest neighbors by weighted edges. Finally, clustering is performed on the fused Text Graph to group the documents. Experiments are conducted on real-world text datasets. The results show that the proposed fusion graph model outperforms existing methods and improves text document clustering in terms of purity and normalized mutual information.
Keywords: Text Document Clustering, Graph Model, Feature Selection, Semantic Word Frequency.
Scope of the Article: Computer Science and Its Applications
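
Illustrative sketch: The pipeline outlined in the abstract (two similarity graphs, k-nearest-neighbor edges with weights, fusion, then clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the similarity measure, the semantic resource, the fusion rule, or the clustering algorithm, so everything below is an assumption made for illustration: cosine similarity over term counts, a hypothetical toy word-to-concept map (`CONCEPTS`) standing in for the semantic features, a weighted average (`alpha`) as the fusion rule, and connected components as a stand-in clustering step. The feature reduction step is omitted.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy word-to-concept map standing in for a semantic resource.
CONCEPTS = {"car": "vehicle", "automobile": "vehicle",
            "engine": "engine", "motor": "engine",
            "football": "football", "soccer": "football",
            "match": "game", "game": "game"}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_matrix(vectors):
    """Pairwise similarities; the diagonal is set to 0 to exclude self-loops."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def knn_graph(sim, k):
    """Undirected graph: each node keeps weighted edges to its k most similar nodes."""
    graph = {}
    for i, row in enumerate(sim):
        candidates = [j for j in range(len(row)) if j != i and row[j] > 0]
        for j in sorted(candidates, key=lambda j: row[j], reverse=True)[:k]:
            graph[frozenset((i, j))] = row[j]
    return graph

def fuse(g_count, g_sem, alpha=0.5):
    """Assumed fusion rule: weighted average of edge weights (missing edge = 0)."""
    edges = set(g_count) | set(g_sem)
    return {e: alpha * g_count.get(e, 0.0) + (1 - alpha) * g_sem.get(e, 0.0)
            for e in edges}

def connected_components(n, graph, min_weight=0.0):
    """Stand-in clustering: union-find over edges heavier than min_weight."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge, weight in graph.items():
        if weight > min_weight:
            i, j = tuple(edge)
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

docs = ["car engine repair", "automobile motor repair",
        "football match score", "soccer game score"]
tokens = [d.split() for d in docs]

count_vectors = [Counter(t) for t in tokens]
semantic_vectors = [Counter(CONCEPTS.get(w, w) for w in t) for t in tokens]

g_count = knn_graph(similarity_matrix(count_vectors), k=1)
g_sem = knn_graph(similarity_matrix(semantic_vectors), k=1)
fused = fuse(g_count, g_sem, alpha=0.5)
labels = connected_components(len(docs), fused)
print(labels)
```

In this toy corpus the two vehicle documents share only one surface word, so their count-based edge is weak, while the concept-level view treats them as identical. Averaging the two edge weights is one simple way to let both views contribute. A real system would replace connected components with a proper graph clustering method and would evaluate against labeled data using purity and normalized mutual information, as the abstract describes.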