A systematic literature review and meta-analysis on cross project defect prediction

Hosseini, Seyedrebvar; Turhan, Burak; Gunarathna, Dimuthu (2019-02-01)

URL:

https://doi.org/10.1109/TSE.2017.2770124

Institute of Electrical and Electronics Engineers

01.02.2019

S. Hosseini, B. Turhan and D. Gunarathna, "A Systematic Literature Review and Meta-Analysis on Cross Project Defect Prediction," in IEEE Transactions on Software Engineering, vol. 45, no. 2, pp. 111-147, 1 Feb. 2019. doi: 10.1109/TSE.2017.2770124

https://rightsstatements.org/vocab/InC/1.0/
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2019092329446

Abstract

Background: Cross project defect prediction (CPDP) has recently gained considerable attention, yet there have been no systematic efforts to analyse the existing empirical evidence.

Objective: To synthesise the literature to understand the state of the art in CPDP with respect to metrics, models, data approaches, datasets and associated performances. Further, we aim to assess the performance of CPDP versus within project defect prediction (WPDP) models.

Method: We conducted a systematic literature review. Results from primary studies are synthesised (thematic, meta-analysis) to answer research questions.

Results: We identified 30 primary studies passing quality assessment. Performance measures, except precision, vary with the choice of metrics. Recall, precision, f-measure, and AUC are the most common measures. Models based on Nearest-Neighbour and Decision Tree tend to perform well in CPDP, whereas the popular naïve Bayes yields average performance. Performance of ensembles varies greatly across f-measure and AUC. Data approaches address CPDP challenges using row/column processing, which improves CPDP in terms of recall at the cost of precision. This is observed on multiple occasions, including the meta-analysis of CPDP versus WPDP. NASA and Jureczko datasets seem to favour CPDP over WPDP more frequently.

Conclusion: CPDP is still a challenge and requires more research before trustworthy applications can take place. We provide guidelines for further research.
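
For readers unfamiliar with the performance measures named in the results, the following is a minimal sketch of how precision, recall and f-measure are computed from confusion-matrix counts, and how recall can rise while precision falls. The function name `precision_recall_f1` and all counts are hypothetical illustrations; none of the numbers come from the reviewed studies.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f-measure) from confusion-matrix counts.

    tp: defective modules correctly flagged
    fp: clean modules wrongly flagged
    fn: defective modules missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a project with 100 defective modules.
# "baseline" stands for a model without data processing, "processed"
# for the same model after some row/column processing step.
baseline = precision_recall_f1(tp=30, fp=20, fn=70)
processed = precision_recall_f1(tp=60, fp=90, fn=40)

for name, (p, r, f) in [("baseline", baseline), ("processed", processed)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
```

In this invented example the processed model finds more of the defective modules (higher recall) but flags many more clean modules (lower precision), which is the kind of trade-off the results describe.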
