Nedunchezhian, R.
- A Fast Boosting based Incremental Genetic Algorithm for Mining Classification Rules in Large Datasets
Authors
Affiliations
1 Department of CSE, Park College of Engineering and Technology, Coimbatore, IN
2 Department of CSE, Kalaignar Karunanidhi Institute of Technology, Coimbatore, IN
Source
Software Engineering, Vol 6, No 5 (2014), Pagination: 137-141
Abstract
A genetic algorithm (GA) is a search technique based on the process of natural evolution. It is widely used by the data mining community for classification rule discovery in complex domains. During learning, it makes several passes over the data set to determine the accuracy of the candidate rules, which makes it an extremely I/O-intensive and slow process. A GA is particularly difficult to apply when the training data set is very large or not fully available. This paper proposes an incremental genetic algorithm based on boosting, which constructs a weak ensemble of classifiers in a fast, incremental manner and thereby tries to reduce the learning cost considerably.
Keywords
Classification, Incremental Learning, Genetic Algorithm (GA), Scalability, Boosting.
- An Alternative Extension of the K-Means Algorithm for Clustering Medical Data
Authors
Affiliations
1 Department of Computer Science and Engineering, Kalaignar Karunanidhi Institute of Technology, Coimbatore, IN
2 Department of Master of Computer Applications, PSG College of Arts and Science, Coimbatore, IN
Source
Data Mining and Knowledge Engineering, Vol 1, No 8 (2009), Pagination: 375-382
Abstract
Data clustering is a very powerful technique in many application areas. Not only may the clusters be meaningful in themselves, but clustering also enables efficient data management, because data grouped in the same cluster will usually be accessed together. Access to data within a cluster may predict that other data in that cluster will be accessed soon; this can lead to optimized storage strategies that perform much better than storing the data randomly.
Most of the earlier work on clustering has focused on numerical data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. Recently, the problem of clustering categorical data has started drawing interest. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the same time, working only on numerical data prohibits it from being used for clustering categorical data. The main contribution of this paper is to show how to apply the notion of “cluster centers” to a dataset of categorical objects and how to use this notion to formulate the clustering of categorical objects as a partitioning problem. Finally, a k-means-like algorithm for clustering categorical data is introduced. The clustering performance of the algorithm is demonstrated on well-known medical data sets.
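The abstract does not spell out how the "cluster centers" are defined, so the following is only a minimal sketch of the general idea of a k-means-like partitioning loop on categorical objects. It assumes a k-modes-style formulation: each center is the per-attribute most frequent category, and dissimilarity is the number of mismatching attributes. These choices, and the names `mismatch`, `mode_center`, and `cluster_categorical`, are illustrative assumptions and not the authors' method.

```python
from collections import Counter
import random


def mismatch(a, b):
    """Simple matching dissimilarity: count of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_center(rows):
    """Assumed center definition: the most frequent category per attribute."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def cluster_categorical(data, k, max_iter=20, seed=0):
    """Partition tuples of categorical values into k clusters.

    Alternates assignment and center update, as k-means does, but with
    categorical centers. Requires at least k distinct rows in `data`.
    """
    rng = random.Random(seed)
    # Start from k distinct objects so no two initial centers coincide.
    centers = rng.sample(sorted(set(data)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: mismatch(row, centers[j]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the partition has converged
        labels = new_labels
        # Update step: recompute each center from its current members.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mode_center(members)
    return labels, centers
```

Calling `cluster_categorical(records, k=2)` on a list of equal-length tuples of category labels returns one cluster index per record together with the final centers. Like k-means, the loop only reaches a local optimum, so the result can depend on the seed.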