
Data Mining Driven Models for Diagnosis of Diabetes Mellitus: A Survey


Affiliations
1 Department of Mathematics and Computer Science, Federal University of Kashere, Nigeria
 



Authors

F. S. Ishaq
Department of Mathematics and Computer Science, Federal University of Kashere, Nigeria
L. J. Muhammad
Department of Mathematics and Computer Science, Federal University of Kashere, Nigeria
B. Z. Yahaya
Department of Mathematics and Computer Science, Federal University of Kashere, Nigeria
Y. Atomsa
Department of Mathematics and Computer Science, Federal University of Kashere, Nigeria

Abstract


Objective: This study systematically identifies and reviews data mining concepts, tasks, and model evaluation techniques; Knowledge Discovery and Data Mining (KDDM) process models; and research articles from reputable journal publishers that apply data mining techniques to the diagnosis of Diabetes Mellitus (DM). Method/Analysis: The findings are drawn from the reviewed published articles, which were analyzed using frequency analysis. Findings: Classification has been the most successful and most frequently used data mining task for the diagnosis of DM, and the most commonly used classification algorithms are Support Vector Machine (SVM) and decision tree algorithms. Novelty/Improvement: The study found SVM to be the most efficient data mining algorithm for diagnosing DM, using either clinical datasets or combined biological and clinical datasets. Despite its popularity, the SVM algorithm should be improved in future work to further increase its efficiency.
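
The frequency analysis named under Method/Analysis can be illustrated with a minimal Python sketch. It assumes each reviewed article is recorded with the data mining task and algorithm it used. The records, field names, and the `frequency_table` helper are hypothetical placeholders for illustration and are not the survey's actual data or procedure.

```python
from collections import Counter

# Hypothetical records: one entry per reviewed article. These values are
# illustrative placeholders, not the tallies reported by the survey.
reviewed_articles = [
    {"task": "classification", "algorithm": "Support Vector Machine"},
    {"task": "classification", "algorithm": "Decision Tree"},
    {"task": "classification", "algorithm": "Support Vector Machine"},
    {"task": "association", "algorithm": "Apriori"},
    {"task": "regression", "algorithm": "Linear Regression"},
]


def frequency_table(records, field):
    """Return (value, count, percentage) rows sorted by descending count."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return [(value, n, 100.0 * n / total) for value, n in counts.most_common()]


task_table = frequency_table(reviewed_articles, "task")
algorithm_table = frequency_table(reviewed_articles, "algorithm")

# Print one row per task: name, count, and share of all reviewed articles.
for value, n, pct in task_table:
    print(f"{value:<16}{n:>3}{pct:>7.1f}%")
```

The first row of each table is the most frequent value, which is how a review of this kind could rank tasks and algorithms by how often the surveyed articles used them.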






DOI: https://doi.org/10.17485/ijst%2F2018%2Fv11i42%2F132665