
A Semisupervised Cascade Classification Algorithm


Affiliations
1 Department of Mathematics, University of Patras, 26504 Rio, Greece
2 Department of Electrical and Computer Engineering, University of Patras, 26504 Rio, Greece
 


Authors

Stamatis Karlos
Department of Mathematics, University of Patras, 26504 Rio, Greece
Nikos Fazakis
Department of Electrical and Computer Engineering, University of Patras, 26504 Rio, Greece
Sotiris Kotsiantis
Department of Mathematics, University of Patras, 26504 Rio, Greece
Kyriakos Sgarbas
Department of Electrical and Computer Engineering, University of Patras, 26504 Rio, Greece

Abstract


Classification is one of the most important tasks in data mining, whose techniques have been adopted by several modern applications. Because most of these applications lack sufficient labeled data, interest has shifted towards semisupervised methods. Under such schemes, collected unlabeled data combined with a much smaller set of labeled examples can yield classification accuracy similar to, or even better than, that of supervised algorithms, which use labeled examples exclusively during the training phase. This paper presents a novel approach for improving semisupervised classification using the Cascade Classifier technique. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the class probability distribution of the initial data. The second-level classifier is supplied with the new dataset and produces the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier with the speed of C4.5 for final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and found that the presented technique achieves better accuracy in most cases.
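
The cascade and self-training ideas summarized in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the Gaussian form of Naive Bayes, the one-split decision stump used as a stand-in for C4.5, the leaf-purity confidence score, the `rounds` and `per_round` parameters, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter

class GaussianNB:
    """Minimal Gaussian Naive Bayes, used here as the base classifier."""
    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.stats = {}
        for c in self.classes:
            rows = [x for x, t in zip(X, y) if t == c]
            means = [sum(col) / len(rows) for col in zip(*rows)]
            # Small constant keeps variances strictly positive.
            varis = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-3
                     for col, m in zip(zip(*rows), means)]
            self.stats[c] = (math.log(len(rows) / len(X)), means, varis)
        return self

    def predict_proba(self, x):
        logp = []
        for c in self.classes:
            prior, means, varis = self.stats[c]
            logp.append(prior + sum(
                -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                for v, m, s in zip(x, means, varis)))
        top = max(logp)  # subtract the maximum for numerical stability
        exps = [math.exp(l - top) for l in logp]
        return [e / sum(exps) for e in exps]

def leaf(labels):
    """Return (majority label, purity) for a group of labels."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

class Stump:
    """One-split decision tree: an assumed stand-in for C4.5, not C4.5."""
    def fit(self, X, y):
        self.rule, self.default = None, leaf(y)
        best_err = len(y) + 1
        for j in range(len(X[0])):
            vals = sorted(set(x[j] for x in X))
            for a, b in zip(vals, vals[1:]):
                t = (a + b) / 2  # candidate threshold between two values
                left = [c for x, c in zip(X, y) if x[j] <= t]
                right = [c for x, c in zip(X, y) if x[j] > t]
                l, r = leaf(left), leaf(right)
                err = (sum(c != l[0] for c in left)
                       + sum(c != r[0] for c in right))
                if err < best_err:
                    best_err, self.rule = err, (j, t, l, r)
        return self

    def predict(self, x):
        if self.rule is None:
            return self.default
        j, t, l, r = self.rule
        return l if x[j] <= t else r

class Cascade:
    """The base learner's class probabilities are appended to each instance."""
    def fit(self, X, y):
        self.base = GaussianNB().fit(X, y)
        self.top = Stump().fit([self.augment(x) for x in X], y)
        return self

    def augment(self, x):
        return list(x) + self.base.predict_proba(x)

    def predict(self, x):
        return self.top.predict(self.augment(x))  # (label, confidence)

def self_train(X_lab, y_lab, X_unl, rounds=5, per_round=2):
    """Move the most confidently predicted unlabeled instances into the
    labeled set and retrain, for at most `rounds` iterations."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    model = Cascade().fit(X_lab, y_lab)
    for _ in range(rounds):
        if not pool:
            break
        scored = sorted(((model.predict(x), x) for x in pool),
                        key=lambda s: s[0][1], reverse=True)
        for (label, _conf), x in scored[:per_round]:
            X_lab.append(x)
            y_lab.append(label)
            pool.remove(x)
        model = Cascade().fit(X_lab, y_lab)
    return model

# Toy data (made up for illustration): two well-separated clusters.
X_lab = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
y_lab = [0, 0, 1, 1]
X_unl = [[0.1, 0.2], [0.3, 0.1], [4.9, 5.0], [5.1, 5.2]]
model = self_train(X_lab, y_lab, X_unl)
```

Calling `model.predict(x)` returns a `(label, confidence)` pair for a new instance. Appending the predicted class instead of the probability distribution, the other option the abstract mentions, would only change `augment`.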