
Performance Evaluation of Feature Selection Measures


Affiliations
1 Department of Computer Science and Engineering, GITAM University, Hyderabad, Telangana, India
2 Department of Information Technology, GITAM University, Hyderabad, Telangana, India
     



Feature selection is one approach to solving the dimensionality problem. Numerous filter evaluation measures are used to produce a good feature subset. This paper compares the Information Gain, Correlation, and Gain Ratio filter measures to assess their performance on high-dimensional datasets. For each filter measure, the computational time required to evaluate a dataset with a Naïve Bayes classifier is calculated. Experimental results on different datasets demonstrate that the Correlation measure is more favourable in terms of computational time than the other measures.
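
As a rough illustration of the three filter measures named above, the following minimal sketch scores a single discrete feature against a class label. It is not the authors' implementation: the function names and toy data are hypothetical, the textbook definitions of information gain and gain ratio are used, and "Correlation" is assumed here to mean the Pearson coefficient, which the abstract does not specify.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """H(class) - H(class | feature) for a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


def pearson_correlation(xs, ys):
    """Pearson correlation (assumed meaning of the Correlation measure)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one feature that predicts the class, one that does not.
labels = [0, 0, 1, 1]
relevant = [0, 0, 1, 1]
irrelevant = [0, 1, 0, 1]

for name, feature in (("relevant", relevant), ("irrelevant", irrelevant)):
    print(name,
          information_gain(feature, labels),
          gain_ratio(feature, labels),
          pearson_correlation(feature, labels))
```

In a filter approach, features would be ranked by such a score and the top-ranked subset passed to the classifier; the ranking step itself never trains the classifier.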

Keywords

Feature Selection, Correlation, Information Gain, Filter Measures, Gain Ratio.




Authors

Smita Chormunge
Department of Computer Science and Engineering, GITAM University, Hyderabad, Telangana, India
Sudarson Jena
Department of Information Technology, GITAM University, Hyderabad, Telangana, India

