
A Novel Clustering based Feature Subset Selection Framework for Effective Data Classification


Affiliations
1 Faculty of Computing, Botho University, Botswana and Department of Information Systems, BIUST, Botswana
2 Department of Information Systems, BIUST, Botswana
 

Background/Objectives: A novel feature selection framework based on the minimum variance method is proposed. The method aims to reduce computational complexity, reduce the number of initial features, and increase the classification accuracy obtained with the selected feature subsets. Methods/Statistical Analysis: Clusters are formed using the minimum variance method. The process is repeated for different pairs of records, and voting is performed over the resulting sets of cluster pairs. The cluster pair with the maximum number of votes is chosen; the highest-priority member of each cluster is then selected using information gain, and the remaining attributes are removed, reducing the number of attributes. Findings: The proposed feature selector is evaluated against existing feature selection algorithms on 9 datasets from the UCI and WebKB datasets. It shows better results than most existing algorithms in terms of number of selected features, classification accuracy, and running time. Improvements/Applications: A new feature selector using the minimum variance method is implemented and found to perform better than popular, computationally expensive traditional algorithms.
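The final selection step described in the abstract (keeping the highest-priority member of each cluster by information gain) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete feature values, assumes the feature clusters have already been produced by the clustering and voting stage, and uses hypothetical names (`columns`, `clusters`, `select_representatives`) and toy data invented purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_representatives(columns, labels, clusters):
    """Keep the single highest-information-gain feature from each cluster."""
    return [
        max(cluster, key=lambda name: information_gain(columns[name], labels))
        for cluster in clusters
    ]


# Hypothetical toy data: four discrete features and a binary class label.
columns = {
    "f1": [0, 0, 1, 1, 0, 1],
    "f2": [0, 0, 1, 1, 1, 1],
    "f3": [1, 0, 1, 0, 1, 0],
    "f4": [0, 0, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 0, 1]

# Assumed output of the clustering stage: two clusters of similar features.
clusters = [["f1", "f2"], ["f3", "f4"]]

selected = select_representatives(columns, labels, clusters)
print(selected)
```

In this toy example one feature survives per cluster, so four attributes are reduced to two; how the clusters themselves are formed and voted on is left to the method described in the paper.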

Keywords

Classification, Data Mining, Dimensionality Reduction, Feature Selection, Information Gain, Minimum Variance Method

Authors

Sivakumar Venkataraman
Faculty of Computing, Botho University, Botswana and Department of Information Systems, BIUST, Botswana
Subitha Sivakumar
Faculty of Computing, Botho University, Botswana and Department of Information Systems, BIUST, Botswana
Rajalakshmi Selvaraj
Department of Information Systems, BIUST, Botswana


DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i4%2F130386