
Density Conscious Subspace Clustering Using ITL Data Structure


Affiliations
1 Department of Information Technology, Bannari Amman Institute of Technology, Tamil Nadu, India
2 Alpha Engineering College, Chennai, Tamil Nadu, India
     



Most subspace clustering algorithms use the monotonicity property to generate higher-dimensional subspaces. This property is not applicable here, since densities vary across subspace cardinalities: if a k-dimensional unit is dense, a (k-1)-dimensional projection of this unit may not be dense. DENCOS therefore devises a mechanism that computes upper bounds of region densities to constrain the search for dense regions; regions whose density upper bounds are lower than the density thresholds are pruned away. It computes the region density upper bounds using a data structure, the DFP-tree, which stores summarized information about the dense regions. The DFP-tree employs the FP-Growth algorithm: it builds an FP-tree based on the prefix-tree concept and uses it throughout the subspace identification process. This method performs repeated horizontal traversals of the data to generate the relevant subspaces, which is time consuming. To reduce the time complexity, we employ the ITL data structure to build a Density Conscious ITL (DITL) tree that is used throughout the subspace identification process. ITL reduces the cost by scanning the database only once and by significantly reducing the horizontal traversals of the database. The algorithm is evaluated through experiments on a collection of benchmark datasets. The experimental results show favourable performance compared with other popular clustering algorithms.

Keywords

Subspace Clustering, ITL Tree, Recall, Precision.
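
The pruning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' DITL tree or the DENCOS DFP-tree: it assumes the data are already discretized into interval ids per dimension, uses a plain dictionary as a stand-in for a single-scan index, and takes one count threshold for the chosen cardinality k. All function names (`build_vertical_index`, `count_upper_bound`, `region_count`, `dense_regions`) are hypothetical.

```python
from itertools import combinations


def build_vertical_index(rows):
    """Single pass over the data: map each (dimension, interval) unit
    to the set of row ids that fall into it."""
    index = {}
    for rid, row in enumerate(rows):
        for dim, interval in enumerate(row):
            index.setdefault((dim, interval), set()).add(rid)
    return index


def count_upper_bound(index, units):
    """A region cannot hold more points than its sparsest 1-d unit,
    so the minimum unit count is an upper bound on the region count."""
    return min(len(index.get(u, ())) for u in units)


def region_count(index, units):
    """Exact region count via intersection of row-id sets,
    without rescanning the original rows."""
    sets = [index.get(u, set()) for u in units]
    return len(set.intersection(*sets)) if sets else 0


def dense_regions(rows, k, threshold):
    """Return k-dimensional regions whose count reaches `threshold`.
    `threshold` is assumed to be the threshold for cardinality k only;
    a density-conscious method would use a different value per k."""
    index = build_vertical_index(rows)
    result = {}
    for combo in combinations(sorted(index), k):
        # A region uses exactly one unit per dimension.
        if len({dim for dim, _ in combo}) < k:
            continue
        # Prune: the upper bound already falls below the threshold.
        if count_upper_bound(index, combo) < threshold:
            continue
        count = region_count(index, combo)
        if count >= threshold:
            result[combo] = count
    return result


# Five points, three dimensions, values are interval ids (assumed data).
rows = [(0, 0, 1), (0, 0, 1), (0, 1, 1), (1, 1, 0), (0, 0, 0)]
dense_2d = dense_regions(rows, k=2, threshold=3)
```

Here the rows are read once to build the index, and every later count comes from set intersections. A candidate is discarded as soon as its upper bound falls below the threshold, so the exact count is computed only for candidates that survive the bound.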

Authors

C. Palanisamy
Department of Information Technology, Bannari Amman Institute of Technology, Tamil Nadu, India
S. Selvan
Alpha Engineering College, Chennai, Tamil Nadu, India
