An Empirical Evaluation of Lazy Learning Classifiers for Text Categorization
The rapid growth of online documents on the World Wide Web makes it necessary to classify those documents into semantic categories. Text categorization is the task of automatically assigning textual documents to a set of predefined categories. In this paper, we report an empirical evaluation of lazy learning classifiers for text categorization on two benchmark datasets: kNN, its distance-weighted variant, and our newly proposed evidence-theoretic kNN. The evidence-theoretic kNN method outperformed the others in every experiment we conducted.
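To make the difference between the first two classifiers concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes documents have already been converted to numeric feature vectors, uses Euclidean distance for simplicity, and applies the linear distance weighting usually attributed to Dudani, where the nearest neighbour gets weight 1 and the k-th gets weight 0. The function names and the toy data are hypothetical. The evidence-theoretic variant is not sketched because the abstract does not describe its combination rule.

```python
import math
from collections import Counter, defaultdict

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(train, query, k):
    """Return the k (distance, label) pairs closest to query, nearest first."""
    pairs = [(euclidean(vec, query), label) for vec, label in train]
    return sorted(pairs, key=lambda p: p[0])[:k]

def knn_predict(train, query, k):
    """Plain kNN: every neighbour casts one vote."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def weighted_knn_predict(train, query, k):
    """Distance-weighted kNN with linear weights (assumed weighting scheme)."""
    neighbours = nearest(train, query, k)
    d_near, d_far = neighbours[0][0], neighbours[-1][0]
    scores = defaultdict(float)
    for d, label in neighbours:
        # All neighbours equidistant: fall back to equal weights.
        w = 1.0 if d_far == d_near else (d_far - d) / (d_far - d_near)
        scores[label] += w
    return max(scores, key=scores.get)

# Hypothetical toy data: 2-D vectors standing in for document features.
train = [
    ((0.0, 0.0), "sports"),
    ((3.0, 0.0), "politics"),
    ((3.2, 0.0), "politics"),
]
query = (1.0, 0.0)

plain = knn_predict(train, query, k=3)
weighted = weighted_knn_predict(train, query, k=3)
print(plain, weighted)
```

On this toy data the two rules disagree: the majority vote follows the two more distant "politics" documents, while the weighted rule favours the single close "sports" document. Being lazy learners, both defer all computation to query time and simply store the training set.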
Keywords
Text Categorization, Lazy Learning, kNN