Background: An application achieves global reach only if it is web based, and such applications now exist in large numbers. Storing and retrieving information is always a challenging task, and retrieving relevant data from high-dimensional data is both significant and complicated. Data mining plays a major role in the information retrieval process. Method: Grouping data makes information retrieval easier. Clustering is one of the most important data mining techniques for grouping data. Document clustering partitions the entire data set into a number of groups, where the data in each group should have a large degree of resemblance. Findings: The K-means algorithm is one of the most important partitioning-based algorithms and is easy to implement. Due to its time complexity, K-means can be hybridized with the Harmony Search Method (HSM). HSM is a new meta-heuristic optimization method that imitates the music improvisation process. Four methodologies, Term Frequency-Inverse Document Frequency (TF-IDF), Coverage Factor (CF), Concept Factorization, and constraint-based clustering, have been applied to the same dataset to cluster the documents. A comparison among these methodologies shows experimentally that the constraint-based clustering method produces efficient clusters and outperforms the other three methods. Constraint-based clustering thus helps the input documents to be clustered in an effective way.
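
To make the TF-IDF and K-means steps named above concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, a plain log(N/df) inverse document frequency, cosine similarity as the resemblance measure, and caller-supplied seed documents for the initial centroids. The function names (`tf_idf`, `cosine`, `mean_vector`, `k_means`) and the four toy documents are hypothetical, and the Harmony Search hybridization is not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one L2-normalized sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vec = {t: (c / len(tokens)) * math.log(n_docs / df[t])
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors


def cosine(a, b):
    """Dot product of two sparse vectors (cosine similarity when both are unit length)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(vectors):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()} if norm else dict(total)


def k_means(vectors, k, seeds, max_iter=20):
    """Cluster vectors into k groups; `seeds` lists the indices of the initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Hypothetical toy corpus: two documents about harmony search, two about text mining.
docs = [
    "harmony search music improvisation",
    "music improvisation harmony memory",
    "document clustering text mining",
    "text mining document retrieval",
]
labels = k_means(tf_idf(docs), k=2, seeds=[0, 2])
print(labels)
```

Fixed seeds keep the sketch deterministic. In practice the initial centroids are usually chosen at random, and the quality of the result depends on that choice, which is one reason K-means is often paired with a global search heuristic.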

Keywords

Concept Factorization, Constraint Based Clustering, Coverage Factor, Harmony Search Method