Ganatra, Amit
- Learning Using Heterogeneous Classifier in Data Mining
Authors
1 Chandubhai S Patel Institute of Technology, Changa, Gujarat, IN
2 Chandubhai S Patel Institute of Technology, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 13 (2011), Pagination: 788-792
Abstract
Data mining can be considered an analytic process designed to explore business or market data in search of consistent patterns or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. Data mining is useful for prediction. The accuracy of individual classifiers can be improved by combining several classifiers and aggregating their predictions. One such method is stacking, an ensemble method in which a number of base classifiers are combined through a single meta-classifier that learns from their outputs. This enhances the benefits obtained from the individual classifiers. This paper reviews the different approaches proposed by various authors.
Keywords
Ensemble of Classifiers, Bagging, Boosting, Stacking, Troika.
- Improved K-Means with Dimensionality Reduction Technique
Authors
1 Charotar Institute of Technology, Changa, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 12 (2011), Pagination: 722-725
Abstract
Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. K-means is a well-known partitioning-based clustering technique that attempts to find a user-specified number of clusters, each represented by its centroid. The K-means algorithm often does not work well on high-dimensional data; hence, to improve efficiency, we apply PCA, a dimensionality reduction technique, to the data set and obtain a reduced data set of possibly uncorrelated variables. A challenging task for any clustering method is determining the number of clusters beforehand. To find the number of clusters, we apply the EM method, which suggests the number of clusters the user should choose by determining a mixture of Gaussians that fits the given data set. Finally, the experimental results show that techniques such as PCA and EM improve the efficiency of K-means clustering.
Keywords
Cluster, EM, K-Means, PCA.
- Scientific Understanding, Experimental Analysis and a Survey on Evolution of Classification Rule Mining Based on Ant Colony Optimization
Authors
1 Department of Computer Engineering, CIT-Changa, Gujarat, IN
2 Department of Computer Engineering, Dharmsinh Desai University, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 82-89
Abstract
Given the explosive rate at which data accumulates on the web, classification has become a complex and dynamic problem. As classification complexity continues to grow, so does the need to design and develop data mining algorithms and techniques. Classification is the most commonly applied data mining technique: a process of finding a set of models or functions that describe and distinguish data classes so that they can be put to use. It is a specialized technique that is moving toward universality. A classification problem is treated as a supervised learning problem. The aim of the classification task is to discover a relationship between the attributes (input) and the class (output), so that the discovered knowledge can be used to predict the class of a new, unknown object. Records are classified according to classification rules. Ant colony optimization is a method inspired by real ants, which forage for food by selecting the shortest of the multiple paths available to reach it. Merging Ant Colony Optimization (ACO) with data mining thus brings a new approach to designing classification rules that helps in extracting information from a specialized data set. This paper surveys the Ant-Miner algorithm for classification rule extraction. The Ant-Miner algorithm extracts classification rules from data using an if-then-else pattern, similar to other traditional classification algorithms. Extracting classification rules from data is an important task in data mining. We present a detailed description of the algorithms available for classification rule mining using ant colony optimization. Variations of the Ant-Miner algorithm are discussed, and the algorithms are compared on critical parameters such as predictive accuracy, number of rules discovered, and number of terms per rule discovered, using different data sets. The paper will thus help readers study the various Ant-Miner algorithms, and the comparison will help the data miner select an algorithm according to need, based on the specialized properties of each algorithm.
Keywords
Ant Colony Optimization (ACO), Classification, Data Mining.
- Incremental Discretization for Naïve Bayes Learning with Optimum Binning
Authors
1 Charotar University of Science and Technology, Changa, Gujarat, IN
2 Charotar University of Science and Technology, Changa, Gujarat, IN
3 Department of Computer Engineering, Dharmsinh Desai University, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 266-271
Abstract
Incremental Flexible Frequency Discretization (IFFD) is a recently proposed discretization approach for Naïve Bayes (NB). IFFD performs satisfactorily by setting the minimal interval frequency for discretized intervals to a fixed number. In this paper, we first argue that this setting cannot guarantee that the selected MinBinSize is always optimal for all data sets, so the performance of Naïve Bayes suffers in terms of classification error. We therefore propose a sequential search method for NB, named Optimum Binning (OB). Experiments were conducted on 4 datasets from the UCI machine learning repository, and performance was compared between NB trained on data discretized by OB, IFFD, and PKID.
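The idea of searching for a minimum bin size rather than fixing it can be illustrated with a minimal sketch. The abstract does not describe the actual Optimum Binning procedure, so everything below is an assumption made for illustration: the equal-frequency binning, the single-attribute Naïve Bayes with Laplace smoothing, the holdout error criterion, and all function names are hypothetical, and neither IFFD nor PKID is reproduced here.

```python
from bisect import bisect_right
from collections import Counter
import math


def equal_frequency_cuts(values, min_bin_size):
    """Cut points for equal-frequency bins holding roughly min_bin_size or more values."""
    ordered = sorted(values)
    n_bins = max(1, len(ordered) // min_bin_size)
    cuts = {ordered[i * len(ordered) // n_bins] for i in range(1, n_bins)}
    return sorted(cuts)


def train_nb(bins, labels, n_bins):
    """Count classes and (class, bin) pairs for a one-attribute Naive Bayes model."""
    return Counter(labels), Counter(zip(labels, bins)), n_bins


def predict_nb(model, b):
    """Return the class with the highest log-posterior for bin index b."""
    class_counts, joint, n_bins = model
    total = sum(class_counts.values())

    def score(c):
        prior = class_counts[c] / total
        # Laplace smoothing keeps unseen (class, bin) pairs from zeroing the score.
        likelihood = (joint[(c, b)] + 1) / (class_counts[c] + n_bins)
        return math.log(prior) + math.log(likelihood)

    return max(sorted(class_counts), key=score)


def holdout_error(train, valid, min_bin_size):
    """Discretize with the given minimum bin size, train NB, return validation error."""
    xs = [x for x, _ in train]
    cuts = equal_frequency_cuts(xs, min_bin_size)
    model = train_nb(
        [bisect_right(cuts, x) for x in xs],
        [y for _, y in train],
        len(cuts) + 1,
    )
    wrong = sum(predict_nb(model, bisect_right(cuts, x)) != y for x, y in valid)
    return wrong / len(valid)


def search_min_bin_size(train, valid, candidates):
    """Try each candidate in order and keep the first one with the lowest error."""
    best = None
    for size in candidates:
        err = holdout_error(train, valid, size)
        if best is None or err < best[1]:
            best = (size, err)
    return best


# Toy data (hypothetical): one numeric attribute, class changes at x = 20.
data = [(x, "a" if x < 20 else "b") for x in range(40)]
train, valid = data[0::2], data[1::2]
print(search_min_bin_size(train, valid, candidates=[20, 10, 5]))
```

On this toy data the largest candidate puts every value in one bin, so the classifier cannot separate the classes, while the smaller candidates can. That is the kind of data-set dependence the abstract attributes to a fixed MinBinSize.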
Keywords
Discretization, Naïve Bayes, Optimum Binning.
- Classification using Generalization Based Decision Tree Induction along with Relevance Analysis Based on Relational Database
Authors
1 Charotar Institute of Technology, Changa, Gujarat, IN
2 Charotar Institute of Technology, Changa, Gujarat, IN