Synonym Based Document Clustering Using Thesaurus

A. Rajeswari; M. Kannan

Affiliations
1 Theivanai Ammal Women's College, Villupuram, Tamil Nadu, India
2 Department of Computer Science and Applications, SCSVMV University, Kanchipuram, Tamil Nadu, India

Abstract

A synonym-based document clustering approach is proposed to cluster more of the documents related to a user query. Synonyms of the query word are obtained from an online thesaurus. Document clustering is a core task in data mining, and many techniques are used for it. In the existing method, the user stores each word and its synonyms in a database. Because the user must enter the words one by one, this takes considerable time, and some words may never be stored. Words with more than one synonym add further complexity. In the proposed method, synonyms are obtained from thesaurus.com, an online thesaurus. Documents containing either the user-entered keyword or any of its synonyms are clustered together. The clustered documents are ranked with the tf-idf method, implemented in C#.NET, so that the results are more relevant and accurate with respect to the user query. A set of text files was used for the experiments. The proposed method performs better than the existing method and removes the need to maintain a synonym database.
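
The idea of expanding a keyword with its synonyms and then ranking the matching documents by tf-idf can be illustrated with a minimal sketch. It is written in Python rather than the C#.NET used by the authors, and it is not their implementation. The `SYNONYMS` dictionary is a hypothetical in-memory stand-in for the thesaurus.com lookup, the function names are illustrative, and the tf-idf variant (term frequency normalized by document length, idf = ln(N/df) + 1) is an assumption, since the abstract does not state which weighting was used.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an online thesaurus lookup; entries are illustrative only.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def expand_query(keyword, synonyms=SYNONYMS):
    """Return the keyword followed by its known synonyms."""
    key = keyword.lower()
    return [key] + [s for s in synonyms.get(key, []) if s != key]


def rank_documents(keyword, documents):
    """Cluster documents that contain the keyword or a synonym, ranked by tf-idf.

    `documents` maps a document name to its text. Documents with no
    matching term are left out of the cluster.
    """
    terms = expand_query(keyword)
    tokens = {name: tokenize(text) for name, text in documents.items()}
    vocab = {name: set(words) for name, words in tokens.items()}
    n_docs = len(tokens)

    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        if not any(counts[t] for t in terms):
            continue  # document is unrelated to the expanded query
        score = 0.0
        for term in terms:
            df = sum(1 for v in vocab.values() if term in v)
            if df == 0:
                continue
            tf = counts[term] / len(words)
            idf = math.log(n_docs / df) + 1.0  # assumed idf variant
            score += tf * idf
        scores[name] = score

    # Highest score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = {
    "a.txt": "The car is fast",
    "b.txt": "An automobile on the road",
    "c.txt": "Cats sleep all day",
}
ranked = rank_documents("car", docs)
```

Here `b.txt` joins the cluster for the query "car" only because of the synonym "automobile", while `c.txt` is excluded because it contains neither the keyword nor a synonym.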