Rough Set Theory Based Attribute Reduction for Breast Cancer Diagnosis
Data mining (DM) techniques are used to discover interesting patterns in different domains, according to the needs of the application and the analyst. The medical field is one of the major users of mining technology for identifying the attributes relevant to diagnosing medical conditions. Breast cancer is one of the most important medical problems. Researchers, aided by technological advances, have attempted to determine its cause and prevention effectively using fewer attributes. In certain cases, however, diagnosis is a lengthy process involving multiple, multilevel attribute analyses. To improve the accuracy of diagnosis with a limited number of attributes, this paper uses a rough set based relative reduct algorithm, which reduces the number of attributes by means of an equivalence relation. The effectiveness of the proposed rough set reduction algorithm is analyzed on the Wisconsin Breast Cancer Dataset (WBCD), and the analysis is presented as part of the paper. The experimental results show that the relative reduct performs better attribute reduction.
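The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to compute a relative reduct with equivalence classes, not necessarily the procedure used in the paper. It assumes a consistent decision table and a simple backward-elimination strategy: an attribute is dropped when the remaining attributes induce the same number of equivalence classes with and without the decision attribute. The function names, the attribute names (`clump`, `size`, `shape`, `diagnosis`), and the four toy records are all hypothetical and are not taken from the WBCD.

```python
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def partition_count(rows: Sequence[Row], attrs: Sequence[str]) -> int:
    """Count the equivalence classes of the indiscernibility relation on attrs.

    Two rows are equivalent when they agree on every attribute in attrs.
    """
    return len({tuple(row[a] for a in attrs) for row in rows})


def preserves_decision(rows: Sequence[Row], attrs: Sequence[str], decision: str) -> bool:
    """True when every equivalence class on attrs maps to a single decision value."""
    return partition_count(rows, attrs) == partition_count(rows, list(attrs) + [decision])


def relative_reduct(rows: Sequence[Row], conditional: Sequence[str], decision: str) -> List[str]:
    """Backward elimination: drop each attribute whose removal keeps the table consistent.

    Assumes the full conditional attribute set already determines the decision.
    """
    reduct = list(conditional)
    for attr in conditional:
        candidate = [a for a in reduct if a != attr]
        if preserves_decision(rows, candidate, decision):
            reduct = candidate
    return reduct


# Hypothetical toy decision table (values are made up for illustration).
rows = [
    {"clump": 1, "size": 1, "shape": 1, "diagnosis": "benign"},
    {"clump": 2, "size": 1, "shape": 2, "diagnosis": "benign"},
    {"clump": 1, "size": 5, "shape": 1, "diagnosis": "malignant"},
    {"clump": 2, "size": 5, "shape": 2, "diagnosis": "malignant"},
]

reduct = relative_reduct(rows, ["clump", "size", "shape"], "diagnosis")
print(reduct)
```

In this toy table the diagnosis is fully determined by `size`, so the other two attributes are eliminated. The result of backward elimination can depend on the order in which attributes are visited, so a different ordering may yield a different (equally valid) reduct on real data.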
Keywords
Data mining, Data Preprocessing, Rough Set, Data reduction, Breast Cancer Diagnosis