The PDF file you selected should load here if your Web browser has a PDF reader plug-in installed (for example, a recent version of Adobe Acrobat Reader).

If you would like more information about how to print, save, and work with PDFs, Highwire Press provides a helpful Frequently Asked Questions about PDFs.

Alternatively, you can download the PDF file directly to your computer, from where it can be opened using a PDF reader. To download the PDF, click the Download link above.

Fullscreen Fullscreen Off


Sampling is an important and necessary step in mining large size databases and is also very useful in performing mining operations, where performance is a critical issue. This study focuses on identifying the effect of sample size in classification of software bugs. To analyze the effect of sample size, experiments are performed using a number of classification algorithms with varities of sample sizes using the software bug repositories of three large open source software's namely Android, Mozilla and MySql. The relationship between the sample size with two primary classification performance parameters accuracy and F-measure is explored in this study. From experiments, it is identified that the parameter F-measure is affected more by the sample size than accuracy.

Keywords

Sampling, Sample Size, Classification, Software Bug, Performance, Classifier Evaluation
User