A Combined Horizontal Parallel Apriori Algorithm and Adaptive Frequent Pattern Growth Algorithm for Big Data Mining
M. Sornalakshmi¹, S. Balamurali², M. Venkatesulu³
¹M. Sornalakshmi, Department of Computer Applications, Kalasalingam Academy of Research and Education, Tamil Nadu, India.
²S. Balamurali, Department of Computer Applications, Kalasalingam Academy of Research and Education, Tamil Nadu, India.
³M. Venkatesulu, Department of Computer Applications, Kalasalingam Academy of Research and Education, Tamil Nadu, India.
Manuscript received on 09 December 2019 | Revised Manuscript received on 21 December 2019 | Manuscript Published on 30 December 2019 | PP: 859-863 | Volume-9 Issue-2S2 December 2019 | Retrieval Number: B11331292S219/2019©BEIESP | DOI: 10.35940/ijitee.B1133.1292S219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Because of the size and complexity of the data, mining big data on a single computer is difficult. As databases grow rapidly, parallel and distributed computing systems offer clear benefits for data mining applications. Parallelizing Association Rule Mining (ARM) algorithms is therefore an important task for efficiently mining frequent itemsets from large databases. Such algorithms partition the database horizontally or increase the number of processors to reduce the overall time needed to mine the frequent itemsets. In this paper, a combined Horizontal Parallel-Apriori (HP-Apriori) and Adaptive Frequent Pattern (FP) Growth algorithm is proposed that divides the database both horizontally and vertically into four sub-processes, so that all four tasks are processed in parallel. The HP-Apriori algorithm speeds up the mining process by using an index file. Adaptive Binomial Distribution (ABD) is applied to the FP-Growth algorithm to determine the minimum support for mining the optimal frequent itemsets. Experimental analysis shows that the combined algorithm reduces the overall execution time and increases computational speed while remaining highly scalable.
Keywords: Apriori Algorithm, Big Data Mining, Frequent Pattern Growth Algorithm, Parallel and Distributed Processing.
Scope of the Article: Data Mining
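The horizontal partitioning idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HP-Apriori: the index file and the Adaptive Binomial Distribution step are not described in enough detail here to reproduce, so the sketch shows only a plain level-wise Apriori in which each of four row-wise partitions counts candidates locally and the counts are summed. All function names (`split_horizontally`, `count_candidates`, `next_candidates`, `parallel_apriori`) are hypothetical, the minimum support is a fixed absolute count chosen by the caller, and a thread pool stands in for the processors of a real distributed system.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_horizontally(transactions, n_parts):
    """Split the transaction list row-wise into n_parts partitions."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def count_candidates(partition, candidates):
    """Count, within one partition, the transactions containing each candidate."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1
    return counts


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori property)."""
    items = sorted({item for itemset in frequent for item in itemset})
    result = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            result.add(frozenset(combo))
    return result


def parallel_apriori(transactions, min_support, n_parts=4):
    """Level-wise Apriori with per-partition counting (assumed absolute support)."""
    partitions = split_horizontally(transactions, n_parts)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        while candidates:
            # Each worker counts the same candidates on its own partition.
            local_counts = pool.map(
                count_candidates, partitions, [candidates] * n_parts
            )
            # Merge the local counts into global support counts.
            total = Counter()
            for counts in local_counts:
                total.update(counts)
            level = {c: n for c, n in total.items() if n >= min_support}
            frequent.update(level)
            k += 1
            candidates = next_candidates(set(level), k)
    return frequent
```

Calling `parallel_apriori(transactions, min_support=3)` on a list of item lists returns a dictionary mapping each frequent itemset (a `frozenset`) to its global support count. Threads are used only to keep the sketch short; because counting is CPU-bound, an actual speedup in Python would need separate processes or machines.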