Enhancing the Performance of Large-scale Profitable Itemset Mining using Efficient Data Structures
A Muralidhar1, Aditya Ashwini Kumar Sathe2, Pattabiraman V3
1Muralidhar A, SCSE, Vellore Institute of Technology, Chennai (Tamil Nadu), India.
2Aditya Ashwinikumar Sathe, SCSE, Vellore Institute of Technology, Chennai (Tamil Nadu), India.
3Pattabiraman V, SCSE, Vellore Institute of Technology, Chennai (Tamil Nadu), India.
Manuscript received on 25 June 2019 | Revised Manuscript received on 05 July 2019 | Manuscript published on 30 July 2019 | PP: 1768-1772 | Volume-8 Issue-9, July 2019 | Retrieval Number: I8151078919/19©BEIESP | DOI: 10.35940/ijitee.I8151.078919
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Frequent itemset mining is the process of extracting the most frequently bought items from a transactional database. Although it gives an idea of the best-selling itemsets, it fails to identify the most profitable items in the database. It is not uncommon for frequent itemsets and profitable itemsets to have minimal intersection; the process of extracting the most profitable itemsets is termed Greater Profitable Itemset (GPI) mining. Among the various approaches to mining GPI, [7] proposed a two-phase algorithm that optimizes the regeneration of GPI when the profit value of any item changes. It consists of keeping track of the pruned items in the first phase and using them to regenerate GPI efficiently in the second phase. This paper proposes an enhancement to the way these changes are tracked: the pruned itemsets are stored according to their constituent items, unlike the earlier algorithm, which stored records iteration-wise. Storing the itemsets according to their constituent items ensures that only the required items are retrieved. In contrast, the earlier algorithm fetches all the items pruned in any iteration, regardless of their relevance. By fetching only the relevant itemsets, the proposed method is expected to significantly reduce the computational requirements.
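The difference between the two storage layouts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `PrunedItemsetIndex` and `IterationLog`, their methods, and the sample itemsets are hypothetical, and the sketch assumes that a profit change for one item only requires revisiting the pruned itemsets that contain that item.

```python
from collections import defaultdict


class PrunedItemsetIndex:
    """Hypothetical item-keyed store: each item maps to the pruned itemsets containing it."""

    def __init__(self):
        self._by_item = defaultdict(set)

    def add(self, itemset):
        # Register the pruned itemset under every one of its constituent items.
        key = frozenset(itemset)
        for item in key:
            self._by_item[item].add(key)

    def affected_by(self, item):
        # Return only the pruned itemsets that contain the changed item.
        return set(self._by_item.get(item, ()))


class IterationLog:
    """Hypothetical baseline: pruned itemsets grouped by the iteration that pruned them."""

    def __init__(self):
        self._by_iteration = defaultdict(list)

    def add(self, iteration, itemset):
        self._by_iteration[iteration].append(frozenset(itemset))

    def fetch_all(self):
        # Without an item key, every stored itemset has to be fetched and scanned.
        return [s for sets in self._by_iteration.values() for s in sets]


if __name__ == "__main__":
    # Made-up pruned itemsets, tagged with the iteration in which they were pruned.
    pruned = [(1, {"a", "b"}), (1, {"c"}), (2, {"b", "c"}), (2, {"d", "e"})]
    index, log = PrunedItemsetIndex(), IterationLog()
    for iteration, itemset in pruned:
        index.add(itemset)
        log.add(iteration, itemset)

    # Suppose the profit of item "b" changes.
    print(len(index.affected_by("b")), "itemsets fetched from the item-keyed store")
    print(len(log.fetch_all()), "itemsets fetched from the iteration-wise log")
```

In this sketch the item-keyed store trades extra memory (each itemset is referenced once per constituent item) for a lookup that touches only the relevant itemsets, whereas the iteration-wise log must be read in full and then filtered.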
Keywords: Enhancing the Performance Although it Provides Algorithm
Scope of the Article: High Performance Concrete