CIP- Efficient Method for Mining Frequent Itemsets From Data Streams Using Landmark Window Model

F. Ramesh Dhanaseelan; M. JeyaSutha

Affiliations
1 Department of Computer Applications, St. Xavier’s Catholic College of Engineering, Chunkankadai - 03., India

Abstract

Applications that process continuous streams of transactions, such as network monitoring, retail market data analysis, and stock market prediction, need “frequent patterns” to be detected repeatedly. Several pattern mining solutions have been developed over the years, yet many challenges remain because stream data is generated rapidly and is continuous, unbounded, and ordered in real time. Extracting frequent patterns from recent data therefore improves the analysis of stream data. This article considers a new landmark window model, CIP (candidate indexing and pruning), for mining such datasets. CIP mines over the entire history of the data stream, which improves accuracy. The article also proposes the candidate indexed sub (CIS)-tree scheme to extract the essential information from each incoming transaction of the data stream. The proposal is compared with the existing “improved data stream mining” (ISDM) algorithm for maximal frequent itemsets. Extensive experimental analyses show that the proposed CIP outperforms the popular ISDM in accuracy and time complexity for high-speed data streams. The article also includes a case study in which the proposed approach is applied to “web prefetching”.
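
To make the landmark window idea concrete, the following is a minimal Python sketch of support counting from a fixed landmark up to the current transaction, where nothing expires from the window. It is not the CIP algorithm and does not implement the CIS-tree or any candidate pruning; the class name `LandmarkWindowCounter`, the `max_size` cap on itemset length, and the sample stream are assumptions introduced only for illustration.

```python
from collections import Counter
from itertools import combinations


class LandmarkWindowCounter:
    """Naive frequent-itemset counter over a landmark window.

    Illustrative only: this is NOT the CIP algorithm or the CIS-tree.
    It enumerates every itemset up to max_size, with no pruning.
    """

    def __init__(self, min_support, max_size=3):
        self.min_support = min_support  # relative threshold in (0, 1]
        self.max_size = max_size        # hypothetical cap on itemset length
        self.counts = Counter()         # itemset -> occurrences since landmark
        self.n_transactions = 0         # transactions seen since landmark

    def add(self, transaction):
        """Fold one incoming transaction into the window; nothing expires."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for size in range(1, min(self.max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                self.counts[itemset] += 1

    def frequent_itemsets(self):
        """Return itemsets whose support since the landmark meets the threshold."""
        threshold = self.min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


# Hypothetical four-transaction stream, processed one transaction at a time.
stream = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
]
counter = LandmarkWindowCounter(min_support=0.5, max_size=2)
for transaction in stream:
    counter.add(transaction)
result = counter.frequent_itemsets()
print(result)
```

Because the window only grows, the counter's memory use grows with the number of distinct itemsets seen, which is the cost that indexing and pruning schemes for stream mining aim to reduce.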

Keywords

Data Streams, Frequent Itemsets, Pruning, Frequent Patterns, Web Prefetching.

International Journal of Advanced Networking and Applications
