
Bisecting K-Means Clustering Approach for High Dimensional Dataset


Affiliations
1 Bharathiar University, Coimbatore, Tamil Nadu, India
2 MCA Department, K.S. Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India
     



High-dimensional data is a common phenomenon in real-world data mining applications. Developing effective clustering methods for high-dimensional datasets is challenging because of the curse of dimensionality. The k-means clustering algorithm is commonly used, but it is time-consuming and computationally expensive, and the quality of the resulting clusters depends on the selection of the initial centroids and on the dimensionality of the data. When the dimensionality of the dataset is high, the accuracy of the result may fall short of expectations, because the chosen dataset cannot be assumed to be free of noise and flaws. Hence, to improve the efficiency and accuracy of mining tasks on high-dimensional data, the data must be pre-processed by an efficient dimensionality reduction method. This paper proposes a method in which the high-dimensional data is reduced through Principal Component Analysis and bisecting k-means clustering is then performed on the reduced data, without requiring initialization of the centroids.
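The two-stage pipeline described above (PCA reduction followed by bisecting k-means) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `pca_reduce`, `two_means`, and `bisecting_kmeans` are hypothetical, the synthetic data is made up for the demonstration, and two choices are assumptions because the abstract does not specify them. First, the cluster with the largest sum of squared errors is the one split at each step. Second, each two-way split is seeded deterministically (the point farthest from the cluster mean, then the point farthest from that one), which is only one possible reading of "no initialization of the centroids".

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def two_means(X, n_iter=50):
    """Split X into two groups using Lloyd iterations from deterministic seeds."""
    # Assumed seeding rule (not taken from the paper): farthest point from the
    # mean, then the point farthest from that first seed.
    a = X[np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))]
    b = X[np.argmax(np.linalg.norm(X - a, axis=1))]
    centers = np.stack([a, b])
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        if labels.min() == labels.max():  # one side is empty: degenerate split
            break
        new_centers = np.stack([X[labels == j].mean(axis=0) for j in (0, 1)])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return labels


def bisecting_kmeans(X, k):
    """Start from one cluster and repeatedly bisect until k clusters exist."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, k):
        # Assumed selection rule: split the cluster with the largest SSE.
        sse = [((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in range(new_label)]
        idx = np.flatnonzero(labels == int(np.argmax(sse)))
        split = two_means(X[idx])
        if split.min() == split.max():  # cluster cannot be split further
            break
        labels[idx[split == 1]] = new_label
    return labels


# Synthetic demonstration: three well-separated groups in 50 dimensions.
rng = np.random.default_rng(0)
centers = np.zeros((3, 50))
centers[1, 0] = 10.0
centers[2, 1] = 30.0
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 50)) for c in centers])

Z = pca_reduce(X, n_components=2)   # 50 features reduced to 2
labels = bisecting_kmeans(Z, k=3)   # cluster in the reduced space
print(np.bincount(labels))          # cluster sizes
```

Because the seeds are chosen deterministically, repeated runs on the same data give the same partition, which removes the dependence on random initial centroids that affects standard k-means. The number of retained components (2 here) is an arbitrary choice for the example.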

Keywords

Bisecting K-Means, Dimensionality Reduction, K-Means, Principal Component Analysis, Principal Components.
Authors

R. Indhumathi
Bharathiar University, Coimbatore, Tamil Nadu, India
S. Sathiyabama
MCA Department, K.S. Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India
