The PDF file you selected should load here if your Web browser has a PDF reader plug-in installed (for example, a recent version of Adobe Acrobat Reader).

If you would like more information about how to print, save, and work with PDFs, Highwire Press provides a helpful Frequently Asked Questions about PDFs.

Alternatively, you can download the PDF file directly to your computer, from where it can be opened using a PDF reader. To download the PDF, click the Download link above.

Fullscreen Fullscreen Off


High-utility itemset mining (HUIM) is an important research topic in data mining field and extensive algorithms have been proposed. However, existing methods for HUIM present too many high-utility itemsets (HUIs), which reduces not only efficiency but also effectiveness of mining since users have to sift through a large number of HUIs to find useful ones. Recently a new representation, closed+ high-utility itemset (CHUI), has been proposed. With this concept, the number of HUIs is reduced massively. Existing methods adopt two phases to discover CHUIs from a transaction database. In phase I, an itemset is first checked whether it is closed. If the itemset is closed, an overestimation technique is adopted to set an upper bound of the utility of this itemset in the database. The itemsets whose overestimated utilities are no less than a given threshold are selected as candidate CHUIs. In phase II, the candidate CHUIs generated from phase 1 are verified through computing their utilities in the database. However, there are two problems in these methods. 1) The number of candidate CHUIs is usually very huge and extensive memory is required. 2) The method computing closed itemsets is time consuming. Thus in this paper we propose an efficient algorithm CloHUI for mining CHUIs from a transaction database. CloHUI does not generate any candidate CHUIs during the mining process, and verifies closed itemsets from a tree structure. We propose a strategy to make the verifying process faster. Extensive experiments have been performed on sparse and dense datasets to compare CloHUI with the state-of-the-art algorithm CHUD, the experiment results show that for dense datasets our proposed algorithm CloHUI significantly outperforms CHUD: it is more than an order of magnitude faster, and consumes less memory.

Keywords

Closed+ High-Utility Itemsets, Pattern Growth, Utility Mining.
User
Notifications
Font Size