
Post Mining Of Frequent Item Sets Using Mutual Information


 

Buyers’ Basket Analysis (BBA), also known as Market Basket Analysis, is a typical example of frequent item set mining. It discovers associations and correlations among items in large transactional or relational data sets, helping retailers develop marketing strategies by showing which items customers frequently purchase together. Association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold, set by domain experts. Many efficient algorithms, such as Apriori, Partitioning, Sampling, and Eclat, are available to generate a large number of associated frequent item sets. Additional analysis can then be performed to discover interesting statistical correlations between associated items. However, such correlation measures assume a linear relationship between two variables with random distribution, so a correlation value alone may not be sufficient to evaluate a system where these assumptions do not hold. We propose a new measure, mutual information, that shows the dependency between frequent item sets in both linear and parabolic data sets and generates more strongly associated frequent item sets.


Keywords

Frequent Item sets, Association rules, mutual information
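
As a minimal sketch of the measures named in the abstract (not the authors' implementation), the Python below computes support and confidence on a made-up set of transactions and compares Pearson correlation with mutual information on toy linear and parabolic data. All function names, the transactions, and the numeric values are illustrative assumptions; mutual information is estimated here from empirical frequencies of discrete values.

```python
import math
from collections import Counter


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(transactions, antecedent | consequent) / support(transactions, antecedent)


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences of equal length."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def pearson(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})

# Mutual information between the presence indicators of two items.
has_bread = [int("bread" in t) for t in transactions]
has_milk = [int("milk" in t) for t in transactions]
item_mi = mutual_information(has_bread, has_milk)

# Toy linear and parabolic relations over symmetric x values.
xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]
parabolic = [x * x for x in xs]

print("support(bread, milk):", rule_support)
print("confidence(bread -> milk):", rule_confidence)
print("MI(bread, milk):", item_mi)
print("linear:    r =", pearson(xs, linear), " MI =", mutual_information(xs, linear))
print("parabolic: r =", pearson(xs, parabolic), " MI =", mutual_information(xs, parabolic))
```

On this toy data the Pearson coefficient is zero for the parabolic relation y = x^2 over symmetric x, while the mutual information stays positive, which is the kind of dependency the abstract says a correlation value alone can miss.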