
An Empirical Evaluation of Lazy Learning Classifiers for Text Categorization


Affiliations
1 Department of Computer Science, Bharathiar University, Coimbatore, India
2 Department of Computer Science, Presidency College, Chennai, India
 

The rapid growth of online documents available on the World Wide Web necessitates classifying those documents into semantic categories. Text categorization is the task of automatically classifying textual documents into a set of predefined categories. In this paper, we report an empirical evaluation of lazy learning classifiers, namely kNN, its distance-weighted variant, and our newly proposed evidence-theoretic kNN, for the text categorization task over two benchmark datasets. We observed that the evidence-theoretic kNN method outperformed the others in all experiments we conducted.
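As a rough illustration of the classifiers named above, the following minimal sketch contrasts plain majority-vote kNN with a distance-weighted variant on a toy corpus. It is not the authors' implementation: the whitespace tokenizer, the cosine distance, the function names (`vectorize`, `cosine_distance`, `knn_classify`), and the toy documents are all assumptions made for the example, and the weighting shown is the commonly cited linear scheme in which the nearest neighbour gets weight 1 and the k-th neighbour gets weight 0. The evidence-theoretic variant is not sketched here.

```python
import math
from collections import Counter, defaultdict

def vectorize(text):
    """Bag-of-words term-frequency vector (lowercased whitespace tokens)."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0

def knn_classify(train, query, k=3, weighted=False):
    """train: list of (text, label) pairs. Returns the predicted label.

    Lazy learning: nothing is fitted in advance; all work happens at query time.
    """
    q = vectorize(query)
    # Keep the k training documents closest to the query.
    neighbours = sorted((cosine_distance(vectorize(t), q), label) for t, label in train)[:k]
    d_first, d_last = neighbours[0][0], neighbours[-1][0]
    votes = defaultdict(float)
    for d, label in neighbours:
        if weighted and d_last != d_first:
            # Linear distance weighting: nearest gets 1, k-th nearest gets 0.
            votes[label] += (d_last - d) / (d_last - d_first)
        else:
            # Plain kNN: every neighbour casts one equal vote.
            votes[label] += 1.0
    return max(votes, key=votes.get)

# Hypothetical toy corpus, for illustration only.
train = [
    ("wheat grain", "grain"),
    ("wheat oil crude barrel", "oil"),
    ("grain oil crude pipeline", "oil"),
    ("stock market shares", "finance"),
]
plain = knn_classify(train, "wheat grain", k=3)
weighted = knn_classify(train, "wheat grain", k=3, weighted=True)
```

On this toy corpus the two rules disagree: the plain vote is carried by the two more distant "oil" documents, while the weighted vote follows the single near-identical "grain" document. This shows only why weighting can matter and says nothing about the reported experimental results.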

Keywords

Text Categorization, Lazy Learning, KNN



Authors

P. Umar Sathic Ali
Department of Computer Science, Bharathiar University, Coimbatore, India
C. Jothi Venkateswaran
Department of Computer Science, Presidency College, Chennai, India

