Rosiline Jeetha, B.
- A Comparative Study on Hierarchical Clustering in Data Mining
Authors
Affiliations
1 Department of Computer Science, RVS College of Arts & Science, Coimbatore, Tamil Nadu, IN
2 Department of Computer Applications (MCA), RVS College of Arts & Science, Coimbatore, Tamil Nadu, IN
Source
Data Mining and Knowledge Engineering, Vol 5, No 12 (2013), Pagination: 470-473
Abstract
Data mining is largely concerned with building models. A model is simply an algorithm or set of rules that connects a collection of data (the input) to a particular target or outcome. Data mining tasks include classification, estimation, prediction, clustering, affinity grouping, and description and profiling. The first three are examples of directed data mining, where the goal is to find the value of a particular target variable. Affinity grouping and clustering are undirected tasks, where the goal is to uncover structure in the data without respect to a particular target variable. Profiling is a descriptive task that may be either directed or undirected. In this paper we review the main methods and approaches of clustering. Clustering is the task of segmenting a heterogeneous population into a number of more homogeneous subgroups, or clusters. This survey covers data mining, data mining issues, clusters, clustering, clustering analysis, clustering algorithms, clustering issues, a comparison of clustering algorithms, and the requirements of clustering in data mining.
Keywords
Data Mining, Clustering, Hierarchical Clustering Algorithm, Agglomerative, Divisive.
- An Optimized Approach to Record Deduplication
Authors
Affiliations
1 Department of Computer Science, R.V.S College of Arts and Science, Sulur, Coimbatore, IN
Source
Data Mining and Knowledge Engineering, Vol 5, No 3 (2013), Pagination: 85-90
Abstract
Record deduplication is a specialized technique for eliminating duplicate copies of repeated records. Duplicate record detection is important for data preprocessing and cleaning. The increasing volume of information available in digital media poses a challenging problem for data administrators, and it also introduces redundant data into databases. A system or method for controlling redundancy and duplication has therefore become very important. Databases are growing in size at an exponential rate and play an important role in every industry. Detecting duplicate records is necessary in the IT industry to obtain precise search results and to reduce storage requirements. This paper presents the problem of duplicate records and their detection. The proposed approach uses the BAT algorithm to generate the optimal similarity measure for deciding whether a record is a duplicate. The optimal similarity measure is generated by running the BAT algorithm on the training datasets. The system is initialized with a population of random solutions and searches for optima by updating bat generations. We used synthetic datasets to analyze the proposed algorithm, and we compared its performance against the genetic programming technique using evaluation metrics. Our approach frees the user from the burden of having to choose and tune this parameter.
Keywords
BAT Algorithm, Data Preprocessing, Duplicate Detection, Data Duplication, Genetic Programming.
- An Emerging Classification Method for Huge Dataset in Clustering
Authors
Affiliations
1 School of Computer Studies (PG), RVS College of Arts and Science, Coimbatore, IN
2 Department of Computer Science, SNS Raja Lakshmi College of Arts and Science, Coimbatore, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 10 (2011), Pagination: 599-601
Abstract
Clustering analysis is used to explore classification for large datasets, and the Canberra distance is generalized so that it can process data with categorical attributes. Based on the generalized Canberra distance definition, an instance of constraint-based clustering is introduced, and nearest neighbor classification is improved. Class-labeled clusters are regarded as classifying models used for classifying data. The proposed classification method can discover data that differ greatly from the instances in the training data, which may indicate a new data type. The method generalizes the Canberra distance from continuous numerical attribute data to mixed attribute data, uses clustering analysis to squash existing instances, and improves the classical nearest neighbor classification method.
Keywords
ID3, C4.5, Canberra Distance, Clustering, Improved Nearest Neighbour.
- A Survey on Classification Methods Based on Decision Tree Algorithms in Data Mining
Authors
Affiliations
1 Bharathiar University, Coimbatore, IN
2 Department of Computer Science, SNS Raja Lakshmi College of Arts and Science, Coimbatore, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 4 (2011), Pagination: 207-210
Abstract
Data mining resides at the junction of traditional statistics and computer science. As distinct from statistics, data mining is more about searching for hypotheses in data that happen to be available than about verifying research hypotheses by collecting data from designed experiments. Data mining is also characterized as being oriented toward problems with a large number of variables and/or samples, which makes scaling up algorithms important. This means developing algorithms with low computational complexity, using parallel computing, partitioning the data into subsets, or finding effective ways to use relational databases. The process- and utility-centered thinking in data mining and knowledge discovery is also manifested in the reported commercial systems. Decision trees are considered one of the most popular approaches for representing classifiers. Researchers from various disciplines, such as statistics, machine learning, pattern recognition, and data mining, have considered the issue of growing a decision tree from available data. The technology for building knowledge-based systems with decision tree algorithms has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes the ID3, C4.5, and CART systems. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete.
Keywords
Decision Tree, ID3, C4.5 and CART.
- Data Mining Techniques on Social Media Drug Related Posts-A Comparative Study and Analysis
Authors
Affiliations
1 Department of Computer Science (PG), PSGR Krishnammal College for Women, Coimbatore, IN
2 Department of Computer Science, Dr. N.G.P College of Arts and Science, Coimbatore, IN