An Improved Bisecting K-Means Algorithm for Text Document Clustering

Janani Balakumar; S. Vijayarani

Affiliations
1 Bharathiar University, Coimbatore, Tamil Nadu, India

Abstract

Cluster analysis is an unsupervised learning approach that groups objects into clusters so that each cluster contains objects that are similar with respect to a predefined condition. Text document clustering is an important text mining technique for efficiently organizing a large volume of documents into a small number of meaningful clusters. The main objective of this research work is to cluster a collection of documents into related groups based on the contents of the documents. To perform this clustering task, the work uses two existing algorithms, K-means and Bisecting K-means, and proposes a new clustering algorithm, Enhanced Bisecting K-means. The experimental results show that the proposed algorithm gives better clustering accuracy than the two existing algorithms.
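For orientation, the standard Bisecting K-means baseline named above can be sketched as follows. This is a minimal illustration only, not the proposed Enhanced Bisecting K-means algorithm, whose details are not given in this abstract. The term-frequency weighting, the rule of splitting the cluster with the largest sum of squared errors, the number of trial splits, and all function names (`tf_vectors`, `two_means`, `bisecting_kmeans`) are assumptions made for the sketch.

```python
import math
import random
from collections import Counter

def tf_vectors(docs):
    """Turn each document into an L2-normalised term-frequency vector (assumed weighting)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w, c in Counter(d.lower().split()).items():
            v[index[w]] = float(c)
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)

def two_means(points, rng, iters=20):
    """Basic K-means with k=2; returns two lists of indices into `points`."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for i, p in enumerate(points):
            nearest = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearest].append(i)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller discards it
        new_centers = [centroid([points[i] for i in g]) for g in groups]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return groups

def bisecting_kmeans(vectors, k, trials=5, seed=0):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break
        target = max(candidates, key=lambda c: sse([vectors[i] for i in c]))
        points = [vectors[i] for i in target]
        best = None
        for _ in range(trials):  # keep the trial split with the lowest total SSE
            left, right = two_means(points, rng)
            if not left or not right:
                continue
            cost = sse([points[i] for i in left]) + sse([points[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, left, right)
        if best is None:
            break  # no valid split found (e.g. all documents identical)
        clusters.remove(target)
        clusters.append([target[i] for i in best[1]])
        clusters.append([target[i] for i in best[2]])
    return clusters

# Hypothetical toy corpus with two topics
docs = ["cat dog pet", "dog cat animal pet",
        "stock market trade", "market stock price trade"]
clusters = bisecting_kmeans(tf_vectors(docs), k=2)
```

Each entry of `clusters` is a list of document indices. Running several trial bisections and keeping the one with the lowest total error reduces the sensitivity to random initial centers that plain K-means shows.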

Keywords

Text Mining, Text Document Clustering, K-Means, Bisecting K-Means, Enhanced Bisecting K-Means.

International Journal of Knowledge Based Computer System
