Adrian Santos and Natalia Juristo. 2018. Comparing techniques for aggregating interrelated replications in software engineering. In Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM '18). ACM, New York, NY, USA, Article 8, 10 pages. DOI: https://doi.org/10.1145/3239235.3239239
Comparing techniques for aggregating interrelated replications in software engineering
|Author:||Santos, Adrian¹; Juristo, Natalia²|
¹ M3S-ITEE University of Oulu, Finland
² Universidad Politécnica de Madrid, Spain
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2018121350714
|Publisher:|| Association for Computing Machinery
|Publish Date:|| 2018-12-13
Context: Researchers from different groups and institutions are collaborating towards the construction of groups of interrelated replications. Applying unsuitable techniques to aggregate interrelated replications’ results may impact the reliability of joint conclusions.
Objectives: To compare the advantages and disadvantages of the techniques applied to aggregate interrelated replications’ results in Software Engineering (SE).
Method: We conducted a literature review to identify the techniques applied to aggregate interrelated replications’ results in SE. We analyze a prototypical group of interrelated replications in SE with the techniques that we identified. We check whether the advantages and disadvantages of each technique—according to mature experimental disciplines such as medicine—materialize in the SE context.
Results: Narrative synthesis and aggregation of p-values do not use all the information contained in the raw data when providing joint conclusions. Aggregated Data (AD) meta-analysis provides visual summaries of results and allows experiment-level moderators to be assessed. Individual Participant Data (IPD) meta-analysis allows results to be interpreted in natural units and both experiment-level and participant-level moderators to be assessed.
Conclusion: All the information contained in the raw data should be used to provide joint conclusions. AD and IPD meta-analysis, when used in tandem, seem suitable for analyzing groups of interrelated replications in SE.
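To make the contrast in the Results concrete, the following is a minimal Python sketch of two of the named techniques. It is an illustration under stated assumptions, not the authors' analysis: Fisher's method stands in for "aggregation of p-values" and a fixed-effect inverse-variance model stands in for AD meta-analysis, and the abstract does not say which variants the paper uses. The function names `fisher_combined_p` and `fixed_effect_pool` are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (assumed variant)."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square survival function with 2k degrees of freedom;
    # for even degrees of freedom it has this closed form.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def fixed_effect_pool(effects, std_errors):
    """Pool per-experiment effect sizes with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se
```

The combined p-value discards the size and direction of each experiment's effect, whereas the pooled estimate keeps both along with a standard error. IPD meta-analysis goes further by fitting a single model to the participant-level raw data, which this sketch does not attempt.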
ESEM '18: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
International symposium on empirical software engineering and measurement
|Type of Publication:|| A4 Article in conference proceedings
|Field of Science:|| 113 Computer and information sciences
This research was developed with the support of the Spanish Ministry of Science and Innovation project TIN2014-60490-P.
© 2018 Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM '18), https://doi.org/10.1145/3239235.3239239.