Objectives: The primary objective of this paper is to design a new and efficient clustering technique for grouping user navigation patterns, so that a classification system can assign a new user to a group of previous users.

Methodology: Three real-time web log data sets were collected from an e-commerce web server, an academic institution web server and a research journal web server. All three were IIS web servers. Navigation patterns are derived in a preprocessing step and then clustered into groups using the traditional Fuzzy C-Means (FCM) technique. The clusters are validated and re-clustered using the Bolzano-Weierstrass Theorem.

Findings: The web log data is preprocessed, and ICA is applied to the user session matrix to select relevant and important features. To measure the clustering accuracy of the proposed and existing methods, the Rand Index and F-measure are calculated and compared. The proposed BWFCM achieves a higher Rand Index and a lower error rate than FCM. To understand the impact of feature selection, the existing and proposed feature selection methods were applied to the data sets and compared on Rand Index, Sum of Squared Errors and F-measure. The method was applied to all three data sets after the data cleaning and session construction steps. Clustering with the proposed algorithm was carried out twice on each data set: once without feature selection and once after feature selection. Clustering results were poor on the full data sets, which contain irrelevant features, and performance improved once the relevant features were selected.

Conclusion: The results demonstrate the significance of the optimized clustering: compared with the existing methods, intra-cluster similarity and inter-cluster dissimilarity both increase.
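As a rough illustration of the baseline described above, the following sketch runs plain Fuzzy C-Means on a toy user session matrix and scores the result with the Rand Index. It is not the authors' BWFCM: the abstract does not specify the Bolzano-Weierstrass re-clustering step, so that step is not attempted here. The function names, the toy `sessions` data, the `truth` labels and the parameter defaults (fuzzifier `m=2.0`, tolerance, seed) are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means (baseline only). Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships are recomputed from distances to the new centers.
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            new_row = [1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                 for k in range(c))
                       for j in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # stop once memberships no longer change
            break
    return centers, u


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (same/different)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# Hypothetical session matrix: rows are user sessions, columns are page-visit counts.
sessions = [[5, 4, 0, 0], [4, 5, 0, 0], [5, 5, 1, 0],
            [0, 0, 5, 4], [0, 1, 4, 5], [0, 0, 5, 5]]
truth = [0, 0, 0, 1, 1, 1]  # assumed reference grouping for the toy data

centers, memberships = fuzzy_c_means(sessions, c=2)
# Harden the fuzzy partition by taking each session's strongest membership.
hard = [max(range(2), key=row.__getitem__) for row in memberships]
print("Rand Index:", rand_index(hard, truth))
```

The Rand Index depends only on whether pairs of sessions are grouped together or apart, so it is unaffected by how the cluster labels happen to be numbered.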

Keywords

Bolzano-Weierstrass Theorem, Clustering, Feature Selection, Navigation Patterns, Web Usage Mining