
CLUBA: A Clustering-Based Approach for Bug Assignment


Affiliations
1 College of Computer & Information Sciences, Prince Sultan University, Saudi Arabia
2 Computer Science and Software Engineering, University of Detroit Mercy, United States
     



Abstract

Modern software systems are highly complex, which makes software maintenance, and bug fixing in particular, very challenging. Identifying an appropriate developer to handle a newly reported bug is difficult and error-prone, which lengthens the bug-fixing process. In this paper, we propose CLUBA, a new developer recommendation approach for assigning relevant developers to fix new bugs. The approach is based on clustering and recommends a varying list of candidate developers based on their experience. First, the effect of selecting different percentages of terms from the corpus on clustering quality is evaluated by applying one of the best feature selection methods. Then, similar bug reports are grouped together using the K-means clustering technique. After that, the developers in each cluster are ranked by their experience. We validated CLUBA on four real open source projects, and the experimental evaluation shows that the approach is feasible.
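The three steps named in the abstract (term selection, K-means clustering of bug reports, and per-cluster developer ranking) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature selection method or define experience, so the sketch assumes document frequency as the selection criterion and the number of bugs a developer fixed in a cluster as the experience measure. All function names, the toy bug reports, and the developer names are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def select_terms(docs, keep_fraction):
    # Assumption: keep the top fraction of terms by document frequency.
    df = Counter(term for doc in docs for term in set(doc))
    ranked = sorted(df, key=lambda t: (-df[t], t))
    return ranked[:max(1, int(len(ranked) * keep_fraction))]

def vectorize(doc, vocab):
    # Term-count vector over the selected vocabulary, scaled to unit length.
    counts = Counter(doc)
    vec = [float(counts[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def nearest(vec, centroids):
    # Index of the centroid with the smallest squared Euclidean distance.
    def dist(c):
        return sum((x - y) ** 2 for x, y in zip(vec, c))
    return min(range(len(centroids)), key=lambda i: dist(centroids[i]))

def kmeans(vectors, initial_centroids, iterations=20):
    # Plain K-means with caller-supplied starting centroids (deterministic).
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for i in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(v, centroids) for v in vectors]
    return centroids, labels

def rank_developers(labels, fixers, k):
    # Assumption: experience = number of bugs a developer fixed in the cluster.
    ranking = []
    for i in range(k):
        counts = Counter(f for f, lab in zip(fixers, labels) if lab == i)
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        ranking.append([dev for dev, _ in ordered])
    return ranking

def recommend(report, vocab, centroids, ranking, top_n=2):
    # Map the new report to its nearest cluster and return that cluster's ranking.
    cluster = nearest(vectorize(tokenize(report), vocab), centroids)
    return ranking[cluster][:top_n]

# Hypothetical history of (bug report summary, developer who fixed it).
history = [
    ("crash when saving file to disk", "alice"),
    ("file save dialog crash on disk full", "alice"),
    ("disk write error while saving file", "bob"),
    ("button color wrong in dark theme", "carol"),
    ("dark theme button text unreadable", "carol"),
    ("theme switch leaves button color stale", "dave"),
]
docs = [tokenize(text) for text, _ in history]
fixers = [fixer for _, fixer in history]

vocab = select_terms(docs, keep_fraction=0.5)
vectors = [vectorize(d, vocab) for d in docs]
centroids, labels = kmeans(vectors, initial_centroids=[vectors[0], vectors[3]])
ranking = rank_developers(labels, fixers, k=2)
suggested = recommend("crash saving file", vocab, centroids, ranking)
print(suggested)
```

On this toy history the new report falls into the storage-related cluster, so its fixers are suggested in order of how many of those bugs they resolved. Because the list contains only developers who fixed bugs in the matched cluster, its length can differ from cluster to cluster; `keep_fraction` is the knob that corresponds to the percentage of terms retained.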

Keywords

Bug Assignment, Developer Recommendation, Mining Bug Repositories, Software Maintenance.




Authors

Mamdouh Alenezi
College of Computer & Information Sciences, Prince Sultan University, Saudi Arabia
Shadi Banitaan
Computer Science and Software Engineering, University of Detroit Mercy, United States
Mohammad Zarour
College of Computer & Information Sciences, Prince Sultan University, Saudi Arabia

