Clustering of Multidimensional Big Data using Enhanced K-Mean Algorithm
Jagdish Kushwah1, Shailesh Jaloree2, R.S.Thakur3
1Jagdish Kushwaha*, currently pursuing a Ph.D. degree program in Computer Applications at BU Bhopal.
2Dr. (Prof.) Shailesh Jaloree, Professor, Department of Computer Science & Applied Mathematics, SATI Vidisha (MP).
3Dr. (Prof.) R. S. Thakur, Professor and HOD, Department of Mathematics, Bioinformatics & Computer Application, MANIT Bhopal (MP).
Manuscript received on March 15, 2020. | Revised Manuscript received on March 25, 2020. | Manuscript published on April 10, 2020. | PP: 630-632 | Volume-9 Issue-6, April 2020. | Retrieval Number: F4126049620/2020©BEIESP | DOI: 10.35940/ijitee.F4126.049620
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: One of the basic issues with K-means clustering is that it converges only to a local optimum, which is easier than solving for the global optimum but can lead to suboptimal convergence. This is especially true for big data, as the initial points play a significant role in the performance of the algorithm. The paper proposes a novel K-means clustering algorithm that presents a technique to find an optimized location of the initial points and the initial number of clusters. This results in the final set of clusters converging globally, facilitating fast and accurate clustering over large datasets. Cloud computing executes large-scale and complex computing. Large amounts of data are analyzed economically and efficiently by using parallelism. To obtain parallelism and scalable computing, the work uses Amazon Web Services with an RStudio Elastic Compute Cloud instance, which divides the job among different nodes. The proposed framework shows highly competitive performance, taking significantly less computation time while being cost-effective. It can be compared with the complex Hadoop Distributed File System and MapReduce. A major drawback of Apache Hadoop is its MapReduce paradigm, which is inefficient when a process iterates many times. R performs execution in memory, which is faster and less complex than repeatedly reading from and writing to disk in MapReduce. The research work is simulated on some well-known real datasets from the UCI Machine Learning Repository. The results confirm that the proposed work models a robust and scalable technique for clustering large datasets.
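The sensitivity to initial points described in the abstract can be illustrated with a minimal sketch of standard K-means (Lloyd's iteration) in plain Python. This is not the enhanced algorithm proposed in the paper; the toy dataset, the function names `sq_dist` and `lloyd_kmeans`, and the two sets of starting centers are all assumptions chosen only to show how a poor initialization can settle in a local optimum.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def lloyd_kmeans(points: List[Point], centers: List[Point],
                 max_iter: int = 100) -> Tuple[List[Point], float]:
    """Standard K-means from the given initial centers (illustrative helper).

    Returns the final centers and the within-cluster sum of squared errors.
    """
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Toy data (assumed): four corners of a wide, short rectangle.
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Starting centers on the left and right sides find the better solution.
good_centers, good_sse = lloyd_kmeans(data, [(0.0, 0.0), (4.0, 0.0)])

# Starting centers stacked in the middle converge to a worse local optimum.
bad_centers, bad_sse = lloyd_kmeans(data, [(2.0, 0.0), (2.0, 1.0)])

print("good start:", good_centers, "SSE =", good_sse)
print("bad start: ", bad_centers, "SSE =", bad_sse)
```

Both runs converge, but the second one splits the data into a top row and a bottom row and stops with a much larger sum of squared errors. This is the kind of outcome that a better choice of initial points is meant to avoid.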
Keywords: Artificial Intelligence, Big Data, Cloud Computing, K-means, MapReduce, R.
Scope of the Article: Artificial Intelligent Methods, Models, Techniques