University of Oulu

Md Alamgir Kabir, Jacky Keung, Burak Turhan, Kwabena Ebo Bennin, Inter-release defect prediction with feature selection using temporal chunk-based learning: An empirical study, Applied Soft Computing, Volume 113, Part A, 2021, 107870, ISSN 1568-4946, https://doi.org/10.1016/j.asoc.2021.107870

Inter-release defect prediction with feature selection using temporal chunk-based learning : an empirical study

Author: Kabir, Alamgir1; Keung, Jacky1; Turhan, Burak2,3; Bennin, Kwabena Ebo4
Organizations: 1Department of Computer Science, City University of Hong Kong, Hong Kong
2University of Oulu, Finland
3Monash University, Australia
4Information Technology Group, Wageningen University and Research, Wageningen, The Netherlands
Format: article
Version: accepted version
Access: embargoed
Persistent link: http://urn.fi/urn:nbn:fi-fe2021100649449
Language: English
Published: Elsevier, 2021
Publish Date: 2023-09-09
Description:

Abstract

Inter-release defect prediction (IRDP) is a practical scenario in which the datasets of the previous release are used to build a prediction model that predicts defects for the current release of the same software project. A practical software project goes through several releases, and the data of each release arrives as a chunk in temporal order. The evolving data of each release introduces new concepts to the model, a phenomenon known as concept drift, which negatively impacts the performance of IRDP models. In this study, we examine and assess the impact of feature selection (FS) on the performance of IRDP models and on the robustness of the models to concept drift. We conduct empirical experiments using 36 releases of 10 open-source projects. The Friedman and Nemenyi post-hoc test results indicate statistically significant differences between the prediction results with and without FS techniques. IRDP models trained on the data of the most recent releases were not always the best models. Furthermore, prediction models trained with carefully selected features could help reduce concept drift.
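The chunk-based setup described in the abstract can be illustrated with a minimal Python sketch: each release is treated as one temporal chunk, features are selected on the previous release, and the resulting model is evaluated on the next release. This is not the study's implementation. The correlation-based filter, the nearest-centroid classifier, the function names, and the toy data are all assumptions chosen to keep the example self-contained.

```python
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def select_features(X, y, k):
    """Filter-style FS (assumed stand-in): the k columns most correlated with the label."""
    n_features = len(X[0])
    scores = [abs(correlation([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, cols):
    """Nearest-centroid model (assumed stand-in): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [[row[j] for j in cols] for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, row, cols):
    """Return the class whose centroid is closest to the row (squared distance)."""
    point = [row[j] for j in cols]
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(point, center))
    return min(centroids, key=lambda label: dist(centroids[label]))

def inter_release_accuracy(releases, k):
    """Train on release i-1 and test on release i, walking the chunks in temporal order."""
    results = []
    for (X_prev, y_prev), (X_cur, y_cur) in zip(releases, releases[1:]):
        cols = select_features(X_prev, y_prev, k)
        model = fit_centroids(X_prev, y_prev, cols)
        hits = sum(predict(model, row, cols) == lab for row, lab in zip(X_cur, y_cur))
        results.append(hits / len(y_cur))
    return results

# Hypothetical toy data: three releases, two metrics per module, label 1 = defective.
releases = [
    ([[10, 5], [12, 1], [50, 4], [55, 2]], [0, 0, 1, 1]),
    ([[11, 3], [14, 6], [48, 1], [60, 5]], [0, 0, 1, 1]),
    ([[9, 2], [13, 4], [52, 6], [58, 3]], [0, 0, 1, 1]),
]

if __name__ == "__main__":
    # One accuracy value per (previous release -> current release) pair.
    print(inter_release_accuracy(releases, k=1))
```

Comparing the per-release scores with and without the feature-selection step, or training on an older release instead of the immediately preceding one, would mirror the kinds of comparisons the abstract reports.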

Series: Applied soft computing
ISSN: 1568-4946
ISSN-E: 1872-9681
ISSN-L: 1568-4946
Volume: 113
Article number: 107870
DOI: 10.1016/j.asoc.2021.107870
OADOI: https://oadoi.org/10.1016/j.asoc.2021.107870
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work is supported in part by the General Research Fund of the Research Grants Council of Hong Kong (No.11208017) and the research funds of City University of Hong Kong (7005028, 7005217), and the Research Support Fund by Intel (9220097), and funding supports from other industry partners (9678149, 9440227, 9440180 and 9220103).
Copyright information: © 2021 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/.