Author Details


Ganatra, Amit

Learning Using Heterogeneous Classifier in Data Mining

Improved K-Means with Dimensionality Reduction Technique

Abstract Views :180 | PDF Views:3

Authors

Amit Thakkar ¹, Nikita Bhatt ¹, Amit Ganatra ¹, Arpita Shah ¹

Affiliations
1 Charotar Institute of Technology, Changa, Nadiad, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 12 (2011), Pagination: 722-725

Abstract

Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. K-means is a well-known partitioning-based clustering technique that attempts to find a user-specified number of clusters, each represented by its centroid. The K-means algorithm often does not work well on high-dimensional data; hence, to improve efficiency, we apply PCA, a dimensionality reduction technique, to the data set and obtain a reduced data set containing possibly uncorrelated variables. A challenging task for any clustering method is determining the number of clusters beforehand. To find the number of clusters, we apply the EM method, which suggests the number of clusters the user should choose by fitting a mixture of Gaussians to the given data set. Finally, the experimental results show that the use of techniques such as PCA and EM improves the efficiency of K-means clustering.

Keywords

Cluster, EM, K-Mean, PCA.
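
The pipeline described in the abstract above (reduce dimensionality with PCA, then run K-means on the reduced data) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single principal component found by power iteration, takes the number of clusters `k` as given instead of estimating it with EM, and all function names and the toy data are hypothetical.

```python
import random

def top_component(rows, iters=200):
    """Return column means and the first principal direction (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(rows, means, v):
    """Project each centered row onto direction v (reduces d dimensions to 1)."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]

def kmeans_1d(values, k, iters=50, seed=0):
    """Plain K-means on 1-D values; k is assumed known here."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid.
        labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in values]
        # Update step: centroid becomes the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids

# Hypothetical toy data: two well-separated groups in three dimensions.
rows = [[1.0, 2.0, 0.1], [1.2, 1.9, 0.0], [0.9, 2.1, 0.2],
        [8.0, 9.0, 0.1], [8.2, 9.1, 0.0], [7.9, 8.8, 0.2]]
means, direction = top_component(rows)
scores = project(rows, means, direction)
labels, centroids = kmeans_1d(scores, k=2)
```

In this toy case the third coordinate carries almost no variance, so the one-dimensional projection keeps the separation between the two groups while K-means works on a third of the original values.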


Scientific Understanding, Experimental Analysis and a Survey on Evolution of Classification Rule Mining Based on Ant Colony Optimization

Abstract Views :181 | PDF Views:3

Authors

Nidhi Shah ¹, Amit Ganatra ¹, C. K. Bhensdadia ², Y. P. Kosta ¹

Affiliations
1 Department of Computer Engineering, CIT-Changa, Gujarat, IN
2 Department of Computer Engineering, Dharmsinh Desai University Nadiad, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 82-89

Abstract

Given the explosive rate of data deposition on the web, classification has become a complex and dynamic phenomenon. As classification complexity continues to grow, so does the need to design and develop data mining algorithms and techniques. Classification is the most commonly applied data mining technique: a process of finding a set of models or functions that describe and distinguish data classes so that they can be put to use. Classification is thus a specialist with specialized skills that is moving toward universality. A classification problem is considered a supervised learning problem. The aim of the classification task is to discover a relationship between the attributes (input) and the class (output), so that the discovered knowledge can be used to predict the class of a new, unknown object. Records or data are classified based on classification rules. Ant colony optimization is a method that derives its inspiration from real ants, which forage for food by selecting the shortest of the multiple possible paths to reach it. Merging the concept of Ant Colony Optimization (ACO) with data mining thus brings a new approach to designing classification rules that is helpful in extracting information from a specialized dataset. In this paper, a survey is done of the Ant-miner algorithm for classification rule extraction. The Ant-miner algorithm extracts classification rules from data using an if-then-else pattern, similar to other traditional algorithms available for the classification task. Extraction of classification rules from data is an important task of data mining. We present a detailed description of the algorithms available for classification rule mining using ant colony optimization. Variations of the ant colony based Ant-miner algorithm are discussed, along with a comparison of the algorithms on critical parameters such as predictive accuracy, number of rules discovered, and number of terms per number of rules discovered, using different data sets. The paper will therefore help in studying the various Ant-miner algorithms, and the comparison carried out will help the data miner select and use an algorithm according to need, based on the specialized properties associated with each algorithm.

Keywords

Ant Colony Optimization (ACO), Classification, Data Mining.


Incremental Discretization for Naïve Bayes Learning with Optimum Binning

Abstract Views :167 | PDF Views:3

Authors

Kamal Sutaria ¹, Amit Ganatra ², Y. P. Kosta ¹, C. K. Bhensdadia ³, Kruti Khalpada ¹

Affiliations
1 Charotar University of Science and Technology, Changa, Gujarat, IN
2 Charotar University of Science and Technology, Changa, Gujarat, IN
3 Department of Computer Engineering, Dharmsinh Desai University, Nadiad, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 266-271

Abstract

Incremental Flexible Frequency Discretization (IFFD) is a recently proposed discretization approach for Naïve Bayes (NB). IFFD performs satisfactorily by setting the minimal interval frequency for discretized intervals to a fixed number. In this paper, we first argue that this setting cannot guarantee that the selected MinBinSize is always optimal for all the different datasets, so the performance of Naïve Bayes is not good in terms of classification error. We thus propose a sequential search method for NB, named Optimum Binning (OB). Experiments were conducted on 4 datasets from the UCI machine learning repository, and performance was compared between NB trained on the data discretized by OB, IFFD, and PKID.

Keywords

Discretization, Naïve Bayes, Optimum Binning.
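
The idea of searching for a bin size instead of fixing it can be sketched as follows. This is only an illustration under stated assumptions, not the Optimum Binning procedure from the paper: it uses non-incremental equal-frequency binning, Laplace-smoothed Naïve Bayes, and a holdout set to score each candidate size. All function names, the candidate list, and the toy data are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter, defaultdict

def equal_frequency_cuts(values, bin_size):
    """Cut points so each interval holds at least `bin_size` training values."""
    ordered = sorted(values)
    cuts = []
    for i in range(bin_size, len(ordered) - bin_size + 1, bin_size):
        cut = (ordered[i - 1] + ordered[i]) / 2
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts

def train_nb(rows, labels, cuts):
    """Count class frequencies and per-attribute bin frequencies per class."""
    prior = Counter(labels)
    cond = defaultdict(Counter)  # (attribute index, class) -> bin counts
    for row, y in zip(rows, labels):
        for j, x in enumerate(row):
            cond[(j, y)][bisect_right(cuts[j], x)] += 1
    return prior, cond

def predict_nb(model, row, cuts):
    """Return the class with the highest Laplace-smoothed log posterior."""
    prior, cond = model
    total = sum(prior.values())
    best, best_score = None, -math.inf
    for y, count in prior.items():
        score = math.log(count / total)
        for j, x in enumerate(row):
            n_bins = len(cuts[j]) + 1
            hits = cond[(j, y)][bisect_right(cuts[j], x)]
            score += math.log((hits + 1) / (count + n_bins))
        if score > best_score:
            best, best_score = y, score
    return best

def search_bin_size(train, train_y, valid, valid_y, candidates):
    """Try each candidate bin size in order; keep the lowest holdout error."""
    best_size, best_err = None, math.inf
    for size in candidates:
        cuts = [equal_frequency_cuts([r[j] for r in train], size)
                for j in range(len(train[0]))]
        model = train_nb(train, train_y, cuts)
        wrong = sum(predict_nb(model, r, cuts) != y for r, y in zip(valid, valid_y))
        err = wrong / len(valid)
        if err < best_err:
            best_size, best_err = size, err
    return best_size, best_err

# Hypothetical toy data: one numeric attribute, class changes between 6 and 7.
train = [[float(x)] for x in range(1, 13)]
train_y = ["a"] * 6 + ["b"] * 6
valid = [[2.2], [6.2], [6.8], [10.2]]
valid_y = ["a", "a", "b", "b"]
best_size, best_err = search_bin_size(train, train_y, valid, valid_y, [2, 3, 4, 6])
```

Ties are resolved in favor of the first candidate tried, which is one of several reasonable choices; a cross-validated error would be a more robust score than a single holdout split.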


Classification using Generalization Based Decision Tree Induction along with Relevance Analysis Based on Relational Database

Abstract Views :196 | PDF Views:3

Authors

Amit Thakkar ¹, Yogeshwar P. Kosta ², Amit Ganatra ²

Affiliations
1 Charotar Institute of Technology, Changa, Gujarat, IN
2 Charotar Institute of Technology, Changa, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 2, No 10 (2010), Pagination: 287-293

Abstract

Classification is a process of sorting unknown values of certain attributes of interest based on the values of other attributes, and it is a major challenge in data mining. A commonly used method is the decision tree. The efficiency of decision tree algorithms has been well established for relatively small data sets. However, this method of classification has problems handling larger data sets and data with continuous numerical values, and it tends to favor attributes with many distinct values when selecting the final determining attribute. In data mining applications, large training sets are common; therefore, decision tree algorithms have scalability limitations. Also, in most data mining applications, users have little knowledge of which signature attribute should be selected for effective mining, so the user depends more on the capability of the algorithm. In this paper, we address two things: first, selecting the right signature attribute, and second, handling large data sets. We accomplish this by proposing a new data classification method that integrates a sequence of steps: data cleaning, attribute-oriented induction (identifying the signature attribute), and relevance analysis as preprocessing, followed by induction of decision trees. This stepwise approach helps us set simple extraction rules at multiple levels of abstraction and easily handles large data sets and continuous numerical values in a scalable way.

Keywords

Data Mining, Classification, Data Cleaning, Decision Tree Induction, Relevance Analysis.
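
The preprocessing described in the abstract above (generalize raw values, then keep only relevant attributes before inducing a tree) can be sketched as follows. The abstract does not say which relevance measure is used, so this sketch assumes information gain; the age concept hierarchy, the threshold, the function names, and the toy records are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Reduction in class entropy obtained by splitting on `attr`."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def age_group(age):
    """Hypothetical concept hierarchy: raw age -> higher-level concept."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"

def generalize(rows):
    """Attribute-oriented generalization: replace continuous age by its group."""
    return [{**row, "age": age_group(row["age"])} for row in rows]

def select_relevant(rows, labels, threshold=0.1):
    """Rank attributes by information gain and drop those below the threshold."""
    gains = {a: information_gain(rows, labels, a) for a in rows[0]}
    ranked = sorted(gains, key=gains.get, reverse=True)
    return [a for a in ranked if gains[a] > threshold]

# Hypothetical toy records: the class depends on age group, not on city.
records = [{"age": 22, "city": "A"}, {"age": 27, "city": "B"},
           {"age": 41, "city": "A"}, {"age": 45, "city": "B"},
           {"age": 63, "city": "A"}, {"age": 70, "city": "B"}]
classes = ["no", "no", "yes", "yes", "no", "no"]
relevant = select_relevant(generalize(records), classes)
```

Generalizing first means the tree learner sees a handful of concept values instead of many distinct numbers, and dropping low-gain attributes shrinks the table that decision tree induction then has to scan.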
