
SEF-USIEF Feature Selector: An Approach to Select Effective Features and Unselect Ineffective Features


Affiliations
1 Quality Assurance and Program Review, New Era College, Botswana
2 Faculty of Health and Education, Botho University, Botswana
     



Feature selection is a data mining technique that reduces the number of features in a dataset by removing noisy features, thereby improving classifier performance in terms of prediction accuracy. Classification tasks often involve a large number of features, many of which are irrelevant or redundant and can even decrease the efficiency of classifiers. Feature Selection (FS) is the most common pre-processing technique used to overcome the drawbacks of high-dimensional datasets. The proposed SEF-USIEF Feature Selector, an approach to Select Effective Features and Unselect Ineffective Features, increases the performance of classification methods by eliminating irrelevant features from the dataset. The SEF-USIEF method is implemented for numerical datasets. It is derived from Ward's minimum variance clustering method, and in this experiment minimum variance is used as the feature selection criterion. The numerical datasets are obtained from the UCI and WebKB repositories. The results obtained by the proposed SEF-USIEF method are compared with existing feature selection methods to analyze whether SEF-USIEF reduces the number of features. The selected features are then given as input to classifiers to check whether classifier performance improves. Based on the comparative analysis of the results, the SEF-USIEF method increases classifier performance while selecting fewer features than the existing feature selection methods.
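The abstract describes filtering numerical features by their variance. The exact SEF-USIEF selection rule is not given here, so the following is only a minimal illustrative sketch of a generic minimum-variance feature filter: features whose variance exceeds a threshold are kept as "effective", and near-constant low-variance features are unselected as "ineffective". The threshold value and the toy dataset are assumptions for illustration.

```python
# Illustrative sketch of a minimum-variance feature filter (not the authors'
# exact SEF-USIEF algorithm, whose details are not given in the abstract).

def feature_variances(rows):
    """Column-wise population variance of a numeric dataset (list of rows)."""
    n = len(rows)
    n_features = len(rows[0])
    variances = []
    for j in range(n_features):
        col = [row[j] for row in rows]
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    return variances

def select_features(rows, threshold):
    """Indices of 'effective' features whose variance exceeds the threshold;
    low-variance (near-constant) features are unselected as ineffective."""
    return [j for j, v in enumerate(feature_variances(rows)) if v > threshold]

# Toy numeric dataset: feature 1 is almost constant and should be dropped.
data = [
    [2.0, 1.0, 10.0],
    [4.0, 1.0, 20.0],
    [6.0, 1.0, 30.0],
    [8.0, 1.1, 40.0],
]
kept = select_features(data, threshold=0.5)
reduced = [[row[j] for j in kept] for row in data]
print(kept)  # → [0, 2]: features 0 and 2 survive the filter
```

The reduced dataset would then be fed to the classifiers, and accuracy compared against training on the full feature set, mirroring the evaluation procedure described above.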

Keywords

Feature Selection, Data Mining, Classifiers, Minimum Variance Cluster Method, Minimum Variance Feature Selector Method
References
  • D. Dua and C. Graff, “UCI Machine Learning Repository”, Available at http://archive.ics.uci.edu/ml, Accessed 2019.
  • N. Elssied and A. Osman, “A Novel Feature Selection Based on One-Way ANOVA F-Test for E-Mail Spam Classification”, Research Journal of Applied Sciences, Engineering and Technology, Vol. 12, pp. 625-638, 2014.
  • A. Galathiya and C. Bhensdadia, “Improved Decision Tree Induction Algorithm with Feature Selection, Cross Validation, Model Complexity and Reduced Error Pruning”, International Journal of Computer Science, Vol. 3, No. 2, pp. 3427-3431, 2012.
  • P. Jamshid and O. Mohammad Hossein, “An Efficient Hybrid Filter-Wrapper Metaheuristic-Based Gene Selection Method for High Dimensional Datasets”, Scientific Reports, Vol. 9, pp. 1-15, 2019.
  • A. Jovic and N. Bogunovic, “A Review of Feature Selection Methods with Applications”, Proceedings of International Convention on Information and Communication Technology, Electronics and Microelectronics, pp. 1-7, 2015.
  • K.H. Keerthi and B. Harish, “A New Feature Selection Method for Sentiment Analysis in Short Text”, Journal of Intelligent Systems, Vol. 12, pp. 1122-1134, 2020.
  • T. Khawla and Z. Azeddine, “Feature Selection Methods and Genomic Big Data: A Systematic Review”, Journal of Big Data, Vol. 12, No. 2, pp. 1-14, 2019.
  • K.I. Ludmila and F.J. William, “PCA Feature Extraction for Change Detection in Multidimensional Unlabeled Data”, IEEE Transactions on Neural Networks and Learning Systems, Vol. 25, No. 1, pp. 69-80, 2014.
  • T. Marwa, “A New Feature Selection Method for Nominal Classifier based on Formal Concept Analysis”, Procedia Computer Science, Vol. 78, pp. 186-194, 2017.
  • N. Mehdi, B. Amir-Masoud and V. Touraj, “A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms”, International Journal of Computer Applications, Vol. 69, No. 17, pp. 1-13, 2013.
  • R. Mehrdad and F. Saman, “A Novel Community Detection Based Genetic Algorithm for Feature Selection”, Machine Learning, Vol. 13, No. 1, pp. 1-16, 2020.
  • C. Nicole and B. Pierre, “New Feature Selection Method based on Neural Network and Machine Learning”, IEEE International Multidisciplinary Conference on Engineering Technology, pp. 81-85, 2016.
  • D. Pijush and R. Susanta, “sigFeature: Novel Significant Feature Selection Method for Classification of Gene Expression Data using Support Vector Machine and T Statistic”, Frontiers in Genetics, Vol. 14, No. 1, pp. 1-14, 2020.
  • E.D. Preetha, G. Deepti Raj and T. Rajendran, “Feature Subset Selection for Irrelevant Data Removal using Decision Tree Algorithm”, Proceedings of International Conference on Advanced Computing, pp. 268-274, 2013.
  • R. Radha and S. Muralidhara, “Removal of Redundant and Irrelevant Data from Training Datasets using Speedy Feature Selection method”, International Journal of Computer Science and Mobile Computing, Vol. 15, pp. 359-364, 2016.
  • B. Raja and T. Babu, “A Novel Feature Selection Based Ensemble Decision Tree Classification Model for Predicting Severity Level of COPD Disease”, Biomedical and Pharmacology Journal, Vol. 23, No. 1, pp. 1-18, 2019.
  • P. Selwyn, “Evaluating Feature Selection Methods for Learning in Data Mining Applications”, Proceedings of International Conference on Computing, Artificial Intelligence and Information Technology, pp. 483-494, 2006.
  • C. Silvia and V. Marco, “A Hybrid Feature Selection Method for Classification Purposes”, Proceedings of International Conference on European Modelling, pp. 1-13, 2016.
  • C. Sunyoung and K. Hyuntaek, “Automatic Recognition of Alzheimer’s Disease using Genetic Algorithms and Neural Network”, Lecture Notes in Computer Science, pp. 695-702, 2003.
  • M.S. Suresh and N. Athi, “Improving Classification Accuracy using Combined Filter+Wrapper Feature Selection Technique”, Proceedings of International Conference on Electrical, Computer and Communication Technologies, pp. 1-6, 2019.
  • P. Xiaoqing and C. Yaokai, “Hybrid Feature Selection Model based on Machine Learning and Knowledge Graph”, Journal of Physics: Conference Series, Vol. 23, No. 1, pp. 1-14, 2021.
  • P. Yonghong and J. Jianmin, “A Novel Feature Selection Approach for Biomedical Data Classification”, Journal of Biomedical Information, Vol. 23, No. 1, pp. 15-23, 2010.
  • Z.J. Viharos, K.B. Kis, Á. Fodor and M.I. Büki, “Adaptive, Hybrid Feature Selection (AHFS)”, Pattern Recognition, Vol. 116, 2021.


Authors

Subitha Sivakumar
Quality Assurance and Program Review, New Era College, Botswana
Sivakumar Venkataraman
Faculty of Health and Education, Botho University, Botswana
Asherl Bwatiramba
Faculty of Health and Education, Botho University, Botswana

