
Comparative Analysis of Tanagra and R Data Mining Tool for Diabetic Diagnosis Using K Mean Clustering and Genetic Algorithm by Integrating with S.V.M



Authors

Ramanpreet Kaur
Computer Engineering Department, Punjabi University, Patiala, India
Gurpreet Singh
Computer Engineering Department, Punjabi University, Patiala, India

Abstract


Diabetes mellitus is a chronic disease and a major public health challenge worldwide. According to the International Diabetes Federation, there are currently 246 million diabetic people worldwide, and this number is expected to rise to 380 million by 2025. Diabetes is one of the most common non-communicable diseases in the world. The vast amount of data available in the health care industry is difficult to handle, so data mining is needed to find patterns and relationships among the available features. Medical data mining is a major research area in which evolutionary algorithms and clustering algorithms play a vital role. Several data mining and machine learning methods have been used for the diagnosis, prognosis, and management of diabetes, and researchers use statistical and data mining tools such as RapidMiner, Weka, and KNIME to help health care professionals diagnose the disease. The data for this research is taken from the UCI repository. Experiments are run iteratively with various techniques in both Tanagra and R. In this work, K-Means is used to remove noisy data, a genetic algorithm is used to find the optimal set of features, and a Support Vector Machine (SVM) is used as the classifier. The results show that the proposed method achieves an accuracy of 76.44% in Tanagra, compared with 75.39% in R.
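The first two stages described in the abstract (K-Means noise removal, then genetic-algorithm feature selection) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' Tanagra or R procedure: it uses only the Python standard library, the function names `kmeans`, `drop_noisy`, and `ga_select` are hypothetical, and the noise rule (drop rows whose class label disagrees with the majority label of their cluster) is an assumption, since the abstract does not say how noisy rows are identified. The `fitness` argument is any callable that scores a 0/1 feature mask; in a setup like the one described it would presumably be the accuracy of an SVM trained on the selected features.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]

def dist2(a: Row, b: Row) -> float:
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows: List[Row], k: int, iters: int = 20, seed: int = 0) -> List[int]:
    """Plain Lloyd's k-means; returns a cluster index for each row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    assign = [0] * len(rows)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist2(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def drop_noisy(rows: List[Row], labels: List[int], k: int = 2, seed: int = 0) -> List[int]:
    """Assumed noise rule: keep only rows whose label matches the
    majority label of their cluster. Returns the kept row indices."""
    assign = kmeans(rows, k, seed=seed)
    keep: List[int] = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        cluster_labels = [labels[i] for i in idx]
        majority = max(set(cluster_labels), key=cluster_labels.count)
        keep += [i for i in idx if labels[i] == majority]
    return sorted(keep)

def ga_select(n_features: int, fitness: Callable[[List[int]], float],
              pop_size: int = 20, generations: int = 30,
              p_mut: float = 0.1, seed: int = 0) -> List[int]:
    """Genetic search over 0/1 feature masks (needs n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]        # truncation selection
        children = [scored[0][:]]                # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability p_mut
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)
```

In use, `drop_noisy` would be applied to the feature rows and diagnosis labels first, and `ga_select` would then be run on the cleaned rows with a fitness function that trains and evaluates the classifier on the masked feature columns. The population size, mutation rate, and number of generations shown here are arbitrary defaults, not values reported by the authors.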