Association rule mining with modified apriori algorithm using top down approach

Ashish Shah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Data mining is a field of computer science concerned with extracting useful information from varied sources. As information has become a basic necessity, its relevance and usefulness have grown accordingly. The most important part of association rule mining is the mining of frequent itemsets. Companies perform market basket analysis to retrieve itemsets that are frequent and often used together by customers. The Apriori algorithm is a widely used technique for finding such combinations of items. However, as the length of a frequent itemset increases, the algorithm must pass through many iterations and its performance decreases drastically. In this paper, we propose a modification to the Apriori algorithm that uses a hash function to divide the frequent itemsets into buckets. We further propose a novel technique, used in conjunction with the Apriori algorithm, that eliminates infrequent itemsets from the candidate set. This top-down approach finds the frequent itemsets without going through several iterations, thus saving time and space. Once a large maximal frequent itemset is discovered early in the algorithm, all of its subsets are known to be frequent, so they no longer need to be scanned. The proposed technique has an advantage over the existing Apriori algorithm when the maximal frequent itemset is long.
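The pruning idea in the abstract (every subset of a frequent itemset is itself frequent, so it need not be scanned again) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the hash function or the bucket layout, so that step is omitted, and the function names `support` and `maximal_frequent_itemsets` as well as the toy transactions are assumptions made only for illustration. The sketch enumerates candidates from the largest size downward, which is exponential in the number of items and only suitable for small examples.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Top-down search for maximal frequent itemsets (illustrative only).

    Candidates are tried from the largest size down to single items.
    A candidate that is a subset of an already found frequent itemset
    is skipped without scanning the transactions, because every subset
    of a frequent itemset is also frequent.
    """
    items = sorted(set().union(*transactions))
    maximal = []
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Covered by a known frequent itemset: no scan needed.
            if any(candidate <= found for found in maximal):
                continue
            if support(candidate, transactions) >= min_support:
                maximal.append(candidate)
    return maximal


# Hypothetical market-basket data, invented for this example.
transactions = [
    {"bread", "milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
]

result = maximal_frequent_itemsets(transactions, min_support=2)
print(result)
```

With a minimum support of 2, the only maximal frequent itemset here is {bread, butter, milk}. Its six non-empty proper subsets are never counted against the transactions, which is the saving the abstract describes for long frequent itemsets.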

Original language: English
Title of host publication: Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 747-752
Number of pages: 6
ISBN (Electronic): 9781509023981
DOIs: https://doi.org/10.1109/ICATCCT.2016.7912099
Publication status: Published - 25-04-2017
Externally published: Yes
Event: 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016 - Bengaluru, Karnataka, India
Duration: 21-07-2016 to 23-07-2016

Conference

Conference: 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016
Country: India
City: Bengaluru, Karnataka
Period: 21-07-16 to 23-07-16

Fingerprint

Association rules
Hash functions
Set theory
Computer science
Data mining
Industry

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Signal Processing
  • Software

Cite this

Shah, A. (2017). Association rule mining with modified apriori algorithm using top down approach. In Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016 (pp. 747-752). [7912099] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICATCCT.2016.7912099
@inproceedings{6042f48bad0541baa6663d215274f6cb,
title = "Association rule mining with modified apriori algorithm using top down approach",
author = "Ashish Shah",
year = "2017",
month = "4",
day = "25",
doi = "10.1109/ICATCCT.2016.7912099",
language = "English",
pages = "747--752",
booktitle = "Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN
T1 - Association rule mining with modified apriori algorithm using top down approach
AU - Shah, Ashish
PY - 2017/4/25
Y1 - 2017/4/25

UR - http://www.scopus.com/inward/record.url?scp=85020196801&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020196801&partnerID=8YFLogxK
U2 - 10.1109/ICATCCT.2016.7912099
DO - 10.1109/ICATCCT.2016.7912099
M3 - Conference contribution
AN - SCOPUS:85020196801
SP - 747
EP - 752
BT - Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016
PB - Institute of Electrical and Electronics Engineers Inc.
ER -
