
A Shared Nearest Neighbour Density based Clustering Approach on a Proclus Method to Cluster High Dimensional Data


Affiliations
1 Bharathiar University, Coimbatore - 641046, Tamil Nadu, India
2 Queen Mary's College, Chennai - 600004, Tamil Nadu, India
 

Background/Objective: High dimensional data are datasets whose number of dimensions ranges from a few to hundreds. Clustering such datasets requires an efficient algorithm such as Proclus, but Proclus has the drawback of ignoring clusters that contain few data points. This paper therefore proposes a clustering ensemble that combines the techniques of two clustering algorithms to obtain good-quality clusters even for small groups of data points.

Methods/Statistical Analysis: The paper applies a density based approach on top of the Proclus algorithm so that even small groups of data points are clustered. The Proclus algorithm is modified at a specific point, where the density based algorithm is invoked. The combined algorithm is tested on synthetic datasets.

Findings: The proposed algorithm is found to produce more clusters than Proclus alone. This is because Proclus ignores clusters with few data points, so that only clusters with a large number of data points remain. Once the shared nearest neighbor (SNN) density based algorithm is involved, even the small groups of data points are clustered, which paves the way for a more accurate and efficient clustering process, especially for high dimensional data.

Applications/Improvements: The approach combines two efficient algorithms but implements them in a simple way, thereby reducing the complexity of the algorithm. The proposed technique can be applied to all high dimensional datasets irrespective of their sizes and shapes.

Keywords

Density based Approach, High Dimensional Data, Proclus, SNN Algorithm.
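
To illustrate the shared nearest neighbor idea the abstract relies on, the following is a minimal sketch of SNN similarity and SNN density in Python with NumPy. It is not the authors' implementation, and it does not show the Proclus step or the point at which Proclus is modified, since the abstract does not specify them. The function names and the parameters `k` (neighbor list size) and `eps` (minimum shared-neighbor count) are assumptions chosen for illustration. The sketch also assumes the common convention that two points are SNN-similar only if each appears in the other's k-nearest-neighbor list.

```python
import numpy as np

def knn_membership(points, k):
    """Return an (n, n) 0/1 matrix M with M[i, j] = 1 if j is one of i's k nearest neighbors."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances (fine for small n; illustrative only).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nn] = 1
    return member

def snn_similarity(points, k):
    """SNN similarity: number of shared k-nearest neighbors, kept only for mutual neighbors."""
    member = knn_membership(points, k)
    shared = member @ member.T   # shared[i, j] = size of kNN(i) intersected with kNN(j)
    mutual = member * member.T   # 1 only if i and j list each other (assumed convention)
    return shared * mutual

def snn_density(points, k, eps):
    """SNN density: number of points whose SNN similarity to a point is at least eps (eps >= 1)."""
    sim = snn_similarity(points, k)
    return (sim >= eps).sum(axis=1)
```

Because the similarity is based on neighbor lists rather than raw distances, a small but tight group of points can still reach a high SNN density, which is the property the abstract uses to recover clusters with few data points.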

Authors

S. Gayathri
Bharathiar University, Coimbatore - 641046, Tamil Nadu, India
M. Mary Metilda
Queen Mary's College, Chennai - 600004, Tamil Nadu, India
S. Sanjai Babu
Bharathiar University, Coimbatore - 641046, Tamil Nadu, India



DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i22%2F141641