Bhensdadia, C. K.
- Empirical Study on Error Correcting Output Code Based on Multiclass Classification
Authors
1 Charotar Institute of Technology Changa, Gujarat, IN
2 Dharmsinh Desai University, Nadiad, Gujarat, IN
3 Charotar Institute of Technology Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 76-81
Abstract
A common way to address a multi-class classification problem is to design a model that consists of hand-picked binary classifiers and to combine them to solve the problem. Error-Correcting Output Codes (ECOC) is one such framework for multi-class classification. Recent work in the ECOC domain has shown promising results and improved performance, making the ECOC framework a powerful tool for multi-class classification problems. Its error-correcting ability improves the generalization ability of the base classifiers. This paper introduces state-of-the-art coding designs (one-versus-one, one-versus-all, dense random, sparse random, DECOC, forest-ECOC, and ECOC-ONE) and decoding designs (Hamming, Euclidean, inverse Hamming, Laplacian, β-density, attenuated, loss-based, probabilistic kernel-based, and loss-weighted), along with an empirical study comparing various ECOC methods in this context. Finally, the paper compares various classification methods with the Error-Correcting Output Code method available in Weka, based on experiments carried out with the Weka tool.
Keywords
Coding, Decoding, Error Correcting Output Codes, Multi-class Classification.
- Scientific Understanding, Experimental Analysis and a Survey on Evolution of Classification Rule Mining Based on Ant Colony Optimization
Authors
1 Department of Computer Engineering CIT-Changa, Gujarat, IN
2 Department of Computer Engineering, Dharmsinh Desai University Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 82-89
Abstract
Given the explosive rate at which data is deposited on the web, classification has become a complex and dynamic problem. As classification complexity continues to grow, so does the need to design and develop data mining algorithms and techniques. Classification is the most commonly applied data mining technique: the process of finding a set of models or functions that describe and distinguish data classes so that they can be put to use. It is thus a specialized technique that is moving toward general applicability. A classification problem is treated as a supervised learning problem. The aim of the classification task is to discover a relationship between the attributes (input) and the class (output), so that the discovered knowledge can be used to predict the class of a new, unknown object. Records are classified on the basis of classification rules. Ant colony optimization is a method inspired by real ants, which forage for food by selecting the shortest of the multiple paths available to reach it. Merging Ant Colony Optimization (ACO) with data mining thus brings a new approach to designing classification rules that helps extract information from a specialized dataset. This paper surveys the Ant-Miner algorithm for classification rule extraction. The Ant-Miner algorithm extracts classification rules from data using an if-then-else pattern, similar to other traditional algorithms for classification tasks. Extracting classification rules from data is an important task in data mining. We present a detailed description of the algorithms available for classification rule mining using ant colony optimization. Variations of the ant-colony-based Ant-Miner algorithm are discussed, along with a comparison of the algorithms on critical parameters such as predictive accuracy, number of rules discovered, and number of terms per number of rules discovered, using different data sets. The paper thus helps in studying the various Ant-Miner algorithms, and the comparison will help the data miner select an algorithm according to need, based on the specialized properties of each algorithm.
Keywords
Ant Colony Optimization (ACO), Classification, Data Mining.
- Comprehensive Evolution of Different Methods Used in Data Mining-Based Intrusion Detection System
Authors
1 Charotar Institute of Technology Changa, Gujarat, IN
2 Dharmsinh Desai University, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 90-98
Abstract
Intrusion is defined as an invasion consisting of a set of actions that compromise the integrity, confidentiality, or availability of data resources. Intrusion detection is therefore an important task in securing an information infrastructure. A major challenge in intrusion detection is to unearth intrusions that happen almost instantaneously and thereafter lie embedded, waiting to be discovered, in vast scattered resources in a normally operating real-time communication environment. The role of data mining in intrusion detection is to identify valid, novel, potentially useful, and ultimately understandable patterns in massive data. Applying data mining techniques to detect intrusions of various types in information infrastructure resources is therefore both challenging and demanding. The paper first discusses different intrusion detection techniques, presenting the underlying concepts and the application of data mining approaches as a tool for intrusion detection systems. The techniques include Support Vector Machines (SVMs), which were designed and used as classifiers for binary classification and have helped to solve multi-class problems. In this paper we introduce the fusion of Decision Tree and Support Vector Machine (DT-SVM), which combines and reinforces the two in an effective way for solving multi-class problems in the information resource domain. As confirmed by our findings, this method has the potential to decrease training and testing time, contributing to increased efficiency of the system. The construction order of the binary tree significantly influences classification performance.
Towards the end of the paper, we report on the development of an algorithm that produces a tree-structured multi-class SVM as a data mining technique for intrusion detection, which has been applied successfully to classify data in support of the intrusion detection process.
Keywords
Ant-Miner, COD (Common Outlier Detection), Decision Tree, Fuzzy C-Means, K-Means, MACO, Support Vector Machine (SVM) and Decision-Tree and Support Vector Machine (DT-SVM).
- Support Vector Machine Classification Methods: A Review and Comparison with Different Classifiers
Authors
1 Department of Computer Engineering, Dharmsinh Desai University, Nadiad, Gujarat, IN
2 Charotar University of Science Technology (CHARUSAT), Education Campus, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 1 (2011), Pagination: 45-52
Abstract
Support Vector Machines (SVMs) have been extensively researched in the data mining and machine learning communities over the last decade and actively applied in various domains. SVMs are typically used for learning classification and regression tasks. Two special properties of SVMs are that they (1) achieve high generalization by maximizing the margin and (2) support efficient learning of nonlinear functions via the kernel trick. Many algorithms and their improvements have been proposed to train SVMs. This paper presents a comprehensive description of various SVM methods and compares the SVM classifier with other classification methods.
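The kernel trick mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names `poly2_kernel`, `feature_map`, and `dot` are hypothetical, and the degree-2 polynomial kernel on 2-D inputs is chosen only because its explicit feature map is short enough to write out. The point is that the kernel value equals a dot product in a higher-dimensional feature space without ever constructing that space.

```python
import math
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def feature_map(x: Sequence[float]) -> Tuple[float, float, float]:
    """Explicit feature map for the kernel above (2-D inputs only).

    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2), so phi(x) . phi(y) = (x . y)^2.
    """
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, 4.0)
    implicit = poly2_kernel(x, y)                    # computed in input space
    explicit = dot(feature_map(x), feature_map(y))   # computed in feature space
    print(implicit, explicit, math.isclose(implicit, explicit))
```

Both computations agree up to floating-point rounding; an SVM that uses the kernel needs only the first, cheaper form.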
Keywords
Classifiers, Machine Learning, Predictive Accuracy, Support Vector Machine (SVM).
- Scientific Understanding, Comprehensive Evolution and More Informed Evaluation of Various Sequential Pattern Mining Algorithms
Authors
1 Dharamsinh Desai University, Nadiad, Gujarat, IN
2 Dharamsinh Desai Institute of Technology, Professor and Head of Department, IN
3 CHARUSET University, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 1 (2011), Pagination: 53-59
Abstract
As database resources become complex and bulky, it becomes difficult not only to access specific information but also to extract relevant information from them. One way to address this issue is sequential pattern mining. Sequential pattern mining is a new trend in the domain of data mining and has many useful and exciting applications. In the sequential pattern mining approach, we mainly attempt to discover patterns that are sequential in nature. This helps us predict the next event after a sequence of events. The success of such techniques lies in the design of their algorithms. Today, there are several competitive and efficient algorithms that cope with the popular and computationally expensive task of sequential pattern mining. However, these algorithms are mostly described in isolation. This paper focuses on the need for, and the merits and demerits of, different sequential rule mining algorithms, and categorizes them according to their mining method, the search method adopted, the database formatting employed, and other constraints applied to the database. The motivation for this study is to provide a single source of information that will serve as a ready reference for both researchers and practitioners interested in designing and implementing sequential pattern mining algorithms for categorized databases.
Keywords
Sequential Pattern Mining, Database Formatting, Mining with Constraints, Pattern-Growth Method.
- Incremental Discretization for Naïve Bayes Learning with Optimum Binning
Authors
1 Charotar University of Science and Technology, Changa, Gujarat, IN
2 Charotar University of Science and Technology, Changa, Gujarat, IN
3 Department of Computer Engineering, Dharamsinh Desai University, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 266-271
Abstract
Incremental Flexible Frequency Discretization (IFFD) is a recently proposed discretization approach for Naïve Bayes (NB). IFFD performs satisfactorily by setting the minimal interval frequency for discretized intervals to a fixed number. In this paper, we first argue that this setting cannot guarantee that the selected MinBinSize is always optimal for all datasets. As a result, the performance of Naïve Bayes suffers in terms of classification error. We thus propose a sequential search method for NB, named Optimum Binning (OB). Experiments were conducted on 4 datasets from the UCI machine learning repository, and performance was compared between NB trained on data discretized by OB, IFFD, and PKID.
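The idea of searching over the minimal bin size, rather than fixing it, can be sketched as follows. This is only an illustration under stated assumptions, not the Optimum Binning procedure from the paper, whose details the abstract does not give: `equal_frequency_cuts` is a hypothetical equal-frequency discretizer, `sequential_search` simply tries each candidate size in order, and `error_fn` is a placeholder for whatever classification-error estimate (for example, cross-validated Naïve Bayes error) one would plug in.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def equal_frequency_cuts(values: Sequence[float], min_bin_size: int) -> List[float]:
    """Cut points splitting the sorted values into bins of roughly equal count.

    Each bin receives about min_bin_size values or more (ties between
    neighbouring values can blur this; that case is ignored in this sketch).
    """
    if min_bin_size < 1:
        raise ValueError("min_bin_size must be at least 1")
    ordered = sorted(values)
    n_bins = max(1, len(ordered) // min_bin_size)
    cuts = []
    for k in range(1, n_bins):
        idx = k * len(ordered) // n_bins
        # Place the cut midway between the two neighbouring values.
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def sequential_search(
    candidates: Iterable[int], error_fn: Callable[[int], float]
) -> Tuple[int, float]:
    """Try each candidate bin size in order and keep the lowest-error one."""
    best_size, best_error = None, float("inf")
    for size in candidates:
        err = error_fn(size)  # assumed: classification error for this size
        if err < best_error:
            best_size, best_error = size, err
    if best_size is None:
        raise ValueError("no candidate bin sizes supplied")
    return best_size, best_error
```

In use, `error_fn` would discretize the training data with `equal_frequency_cuts` for the given size, train Naïve Bayes on the result, and return the measured error; the search then returns the bin size with the lowest error on that dataset.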