PANDORA : continuous mining software repository and dataset generation |
|
Author: | Nguyen, Hung1; Lomio, Francesco2; Pecorelli, Fabiano2; |
Organizations: |
1Aalto University, Helsinki, Finland 2Tampere University, Tampere, Finland 3University of Oulu, Oulu, Finland |
Format: | article |
Version: | accepted version |
Access: | open |
Online Access: | PDF Full Text (PDF, 1.1 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe2023032333017 |
Language: | English |
Published: |
Institute of Electrical and Electronics Engineers,
2022
|
Publish Date: | 2023-03-23 |
Description: |
Abstract: Mining software repository activities analyze huge amounts of data gathered from different sources. Different tools have been developed for collecting and aggregating data from repositories, but they do not easily allow researchers to develop new extractors or to integrate data collected from other platforms, in particular platforms that delete their data periodically. Moreover, mining software repository studies are commonly performed on old versions of software projects, and their results are rarely updated periodically. Because studies are not continuously updated, practitioners often do not trust the results of empirical studies. To overcome these issues, in this paper we present Pandora, a tool that automatically and continuously mines data from different existing tools and online platforms and makes it possible to run mining software repository studies and continuously update their results. To evaluate the applicability of our tool, we analyzed 365 projects (developed in different languages), continuously collecting data from December 2020 to May 2021 and running an example study investigating the build-stability of SonarQube rules.
|
Series: |
IEEE International Conference on Software Analysis, Evolution and Reengineering |
ISSN: | 1534-5351 |
ISSN-E: | 2640-7574 |
ISSN-L: | 1534-5351 |
ISBN: | 978-1-6654-3786-8 |
ISBN Print: | 978-1-6654-3787-5 |
Pages: | 263 - 267 |
DOI: | 10.1109/saner53432.2022.00041 |
OADOI: | https://oadoi.org/10.1109/saner53432.2022.00041 |
Host publication: |
2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) |
Conference: |
IEEE International Conference on Software Analysis, Evolution and Reengineering |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences |
Copyright information: |
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |