Author Details

Punithavalli, M.

An Emerging Classification Method for Huge Dataset in Clustering

Abstract Views :242 | PDF Views:2

Authors

B. Rosiline Jeetha ¹, M. Punithavalli ²

Affiliations
1 School of Computer Studies (PG), RVS College of Arts and Science, Coimbatore, IN
2 Department of Computer Science, SNS Raja Lakshmi College of Arts and Science, Coimbatore, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 10 (2011), Pagination: 599-601

Abstract

Clustering analysis is used to explore classification for large datasets, and the Canberra distance is generalized so that it can process data with categorical attributes. Based on the generalized Canberra distance definition, an instance of constraint-based clustering is introduced, and nearest neighbor classification is improved. Class-labeled clusters are regarded as the models used for classifying data. The proposed classification method can discover data that differ greatly from the instances in the training data, which may indicate a new data type. In summary, the method generalizes the Canberra distance from continuous numerical attributes to mixed attributes and uses cluster analysis to squash existing instances, improving on the classical nearest neighbor classification method.
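The abstract does not give the generalized distance itself, so the following is only a minimal sketch of the idea under stated assumptions. Numeric attributes use the standard Canberra term |a - b| / (|a| + |b|); the 0/1 mismatch rule for categorical attributes, the `classify` helper, and its `threshold` parameter are hypothetical choices made for illustration, not the authors' definitions.

```python
from typing import List, Optional, Sequence, Tuple, Union

Value = Union[float, int, str]


def mixed_canberra(x: Sequence[Value], y: Sequence[Value]) -> float:
    """Canberra-style distance over mixed numeric and categorical attributes."""
    if len(x) != len(y):
        raise ValueError("instances must have the same number of attributes")
    total = 0.0
    for a, b in zip(x, y):
        if isinstance(a, str) or isinstance(b, str):
            # Assumed categorical rule: 0 on a match, 1 on a mismatch.
            total += 0.0 if a == b else 1.0
        else:
            denom = abs(a) + abs(b)
            # Standard Canberra term; a 0/0 term is taken as 0.
            total += abs(a - b) / denom if denom else 0.0
    return total


def classify(
    instance: Sequence[Value],
    clusters: List[Tuple[str, Sequence[Value]]],
    threshold: float,
) -> Optional[str]:
    """Return the label of the nearest class-labeled cluster representative.

    Returns None when even the nearest representative is farther away than
    `threshold` (a hypothetical cut-off), flagging a possibly new data type.
    """
    label, dist = min(
        ((lbl, mixed_canberra(instance, rep)) for lbl, rep in clusters),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else None
```

Comparing a new instance against a few cluster representatives instead of every training instance is what makes the nearest neighbor step cheaper; the `None` result corresponds to the "data of big difference" case mentioned in the abstract.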

Keywords

ID3, C4.5, Canberra Distance, Clustering, Improved Nearest Neighbour.

A Survey on Classification Methods Based on Decision Tree Algorithms in Data Mining

Abstract Views :185 | PDF Views:1

Authors

B. Rosiline Jeetha ¹, M. Punithavalli ²

Affiliations
1 Bharathiar University, Coimbatore, IN
2 Department of Computer Science, SNS Raja Lakshmi College of Arts and Science, Coimbatore, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 207-210

Abstract

Data mining resides at the junction of traditional statistics and computer science. As distinct from statistics, data mining is more about searching for hypotheses in data that happens to be available than about verifying research hypotheses by collecting data from designed experiments. Data mining is also characterized as being oriented toward problems with a large number of variables and/or samples, which makes scaling up algorithms important. This means developing algorithms with low computational complexity, using parallel computing, partitioning the data into subsets, or finding effective ways to use relational databases. The process- and utility-centered thinking in data mining and knowledge discovery is also manifested in the reported commercial systems. Decision trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines, such as statistics, machine learning, pattern recognition, and data mining, have considered the issue of growing a decision tree from available data. The technology for building knowledge-based systems with decision tree algorithms has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes three such systems: ID3, C4.5, and CART. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete.
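As an illustration of how a decision tree is grown from available data, here is a minimal sketch of the split criterion usually associated with ID3, information gain based on entropy. The abstract does not spell out the criterion, and the function names and the dictionary row format are assumptions made for this example. C4.5 and CART are generally described as using gain ratio and the Gini index instead.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(
    rows: Sequence[Dict[str, Hashable]], labels: Sequence[Hashable], attr: str
) -> float:
    """Entropy reduction obtained by splitting the rows on attribute `attr`."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(
    rows: Sequence[Dict[str, Hashable]],
    labels: Sequence[Hashable],
    attrs: Sequence[str],
) -> str:
    """Pick the attribute with the highest information gain for the next node."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

A full tree builder would call `best_attribute` recursively on each subset until the labels are pure or no attributes remain; that recursion is omitted here to keep the sketch short.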

Keywords

Decision Tree, ID3, C4.5 and CART.

An Enhanced Projected Clustering Algorithm for High Dimensional Space

Abstract Views :200 | PDF Views:1

Authors

B. Shanmugapriya ¹, M. Punithavalli ², G. Selvavinayagam ³

Affiliations
1 Department of Computer Science, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, IN
2 Department of Computer Science, Dr. SNS College of Arts and Science, Coimbatore, IN
3 Department of Computer Science and Engineering, Park College of Engineering & Technology, Coimbatore, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 2 (2011), Pagination: 104-109

Abstract

Clustering is a data mining technique for identifying groups in a data set based on some similarity measure. Clustering high dimensional data has been a major challenge due to the inherent sparsity of the points. Most existing clustering algorithms become substantially inefficient if the required similarity measure is computed between data points in the full dimensional space. A number of projected clustering algorithms have been proposed to overcome this issue. This work develops a robust partitional, distance-based projected clustering algorithm built on the K-means algorithm, with the distance computation restricted to subsets of attributes with dense object values. The algorithm is capable of detecting projected clusters of low dimensionality embedded in a high-dimensional space and avoids computing the distance in the full-dimensional space. The algorithm has been demonstrated using synthetic and real datasets.
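The core idea of restricting the distance computation to a subset of attributes can be sketched as follows. This is not the authors' algorithm: how the dense attribute subsets are found is not described in the abstract, so the per-cluster `dims` lists are simply taken as given, and the normalization by the number of dimensions is an assumption made so that clusters with different subset sizes stay comparable.

```python
import math
from typing import List, Sequence, Tuple

Cluster = Tuple[Sequence[float], Sequence[int]]  # (centroid, relevant dimensions)


def projected_distance(
    point: Sequence[float], centroid: Sequence[float], dims: Sequence[int]
) -> float:
    """Root-mean-square difference measured only on the dimensions in `dims`."""
    if not dims:
        raise ValueError("a cluster needs at least one relevant dimension")
    return math.sqrt(sum((point[d] - centroid[d]) ** 2 for d in dims) / len(dims))


def assign(point: Sequence[float], clusters: List[Cluster]) -> int:
    """K-means style assignment step using each cluster's own attribute subset."""
    return min(
        range(len(clusters)),
        key=lambda i: projected_distance(point, clusters[i][0], clusters[i][1]),
    )
```

For example, with the point `[1.0, 1.0, 0.0]` and clusters `([0.0, 0.0, 100.0], [0, 1])` and `([9.0, 9.0, 0.0], [0, 1])`, the first cluster is chosen because the third attribute is ignored, whereas a full-dimensional Euclidean distance would favor the second.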

Keywords

Clustering, High Dimensional Data, Projected Cluster, K-Means Clustering, Subspace Clustering.

A Survey on Clustering Algorithms

A Survey on Data Clustering Algorithms

Abstract Views :208 | PDF Views:2

Authors

R. Shanmugasundaram ¹, M. Punithavalli ²

Affiliations
1 Department of Computer Science, Erode Arts & Science College, Erode, Tamil Nadu, IN
2 Department of Computer Science, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, IN

Source

Data Mining and Knowledge Engineering, Vol 1, No 8 (2009), Pagination: 421-425

Abstract

Clustering is a significant area of application for a range of fields including data mining, statistical data analysis, image compression, and vector quantization. Moreover, clustering has been formulated in different ways in the machine learning, pattern recognition, optimization, and statistics literature. The basic problem in clustering arises in grouping together data streams that are similar to one another. A variety of algorithms have emerged that meet these requirements and have been successfully applied to real-life data clustering problems. This paper presents a general survey of the various clustering algorithms that have been proposed in the literature. In addition, the future enhancement section of this paper suggests modifications to earlier work to overcome its limitations.

Keywords

Clustering, Data Mining, Image Compression, Machine Learning, Optimization, Pattern Recognition, Statistical Data Analysis, Vector Quantization.

Software Tool for Agent Based Distributed Data Mining

Abstract Views :152 | PDF Views:2

Authors

K. Anandakumar ¹, M. Punithavalli ²

Affiliations
1 Computer Applications Department, Dr. SNS Rajalakshmi College of Arts and Science, Coimbatore, IN
2 Computer Science Department, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, IN

Source

Data Mining and Knowledge Engineering, Vol 1, No 1 (2009), Pagination: 33-39

Abstract

The main objective of this project is to illustrate the maximum utilization of available resources for data mining activities. Mining information and knowledge from huge data sources such as weather databases, financial data portals, or emerging disease information systems has been recognized by industrial companies as an important area with an opportunity for major revenue from applications such as business data warehousing, process control, and personalized online customer services over the Internet and the web. Distributed data mining (DDM) is expected to perform partial analysis of data at the clients and then send the outcome to the server, where it is sometimes aggregated into a global result. The primary issues to be considered for DDM are scalability, privacy of data, and autonomy of data. These issues can be handled easily by using intelligent software agents for distributed data mining, because of their inherent features: they are autonomous and capable of adaptive and deliberative reasoning.
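The pattern of partial analysis at the clients followed by aggregation at the server can be sketched with frequent item set counting, one of the listed keywords. This is a hedged illustration only: the abstract does not describe the tool's actual protocol or agent framework, and the function names, the pair-sized item sets, and the `min_support` parameter are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Sequence, Tuple

Partial = Tuple[Counter, int]  # (local item set counts, local transaction count)


def local_counts(transactions: Sequence[Iterable[str]], size: int = 2) -> Partial:
    """Client side: count item sets of the given size in the local data only."""
    counts: Counter = Counter()
    for t in transactions:
        for itemset in combinations(sorted(set(t)), size):
            counts[itemset] += 1
    return counts, len(transactions)


def aggregate(
    partials: List[Partial], min_support: float
) -> Dict[Tuple[str, ...], int]:
    """Server side: merge the partial counts and keep globally frequent item sets."""
    total: Counter = Counter()
    n = 0
    for counts, size in partials:
        total.update(counts)
        n += size
    return {s: c for s, c in total.items() if n and c / n >= min_support}
```

In this sketch only counts leave each site, not the raw transactions, which is one simple way to respect the data autonomy concern raised in the abstract. Each call to `local_counts` could run inside a separate agent or process, with `aggregate` combining whatever partial results arrive.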

Keywords

Data Mining, Frequent Item Set, Distributed Data Mining.
