20-MAD: 20 years of issues and commits of Mozilla and Apache development |
|
Author: | Claes, Maëlick¹; Mäntylä, Mika V.¹ |
Organizations: |
¹ M3S, ITEE, University of Oulu, Finland |
Format: | article |
Version: | accepted version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.1 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe20201211100361 |
Language: | English |
Published: | Association for Computing Machinery, 2020 |
Publish Date: | 2020-12-11 |
Description: |
Abstract: Data from long-lived, high-profile projects is valuable for research on successful software engineering in the wild. A dataset that links the different software repositories of such projects enables deeper investigations. This paper presents 20-MAD, a dataset linking the commit and issue data of Mozilla and Apache projects. It includes over 20 years of information about 765 projects, 3.4M commits, 2.3M issues, and 17.3M issue comments, and its compressed size is over 6 GB. The data contains all the typical information about source code commits (e.g., lines added and removed, message, and commit time) and issues (status, severity, votes, and summary). The issue comments have been pre-processed for natural language processing and sentiment analysis; this includes emoticons and valence and arousal scores. Linking code repository and issue tracker information allows studying individuals in both types of repositories and also provides more accurate time zone information for issue trackers. To our knowledge, this is the largest linked dataset in size and in project lifetime that is not based on GitHub.
|
ISBN Print: | 978-1-4503-7957-1 |
Pages: | 503 - 507 |
DOI: | 10.1145/3379597.3387487 |
OADOI: | https://oadoi.org/10.1145/3379597.3387487 |
Host publication: |
17th IEEE/ACM International Conference on Mining Software Repositories, MSR 2020, co-located with the 42nd International Conference on Software Engineering, ICSE 2020 |
Conference: |
IEEE/ACM International Conference on Mining Software Repositories |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
213 Electronic, automation and communications engineering, electronics |
Funding: |
The authors have been supported by Academy of Finland grants 298020 and 328058. |
Academy of Finland Grant Number: |
298020, 328058 |
Detailed Information: |
298020 (Academy of Finland Funding decision); 328058 (Academy of Finland Funding decision) |
Copyright information: |
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in 17th IEEE/ACM International Conference on Mining Software Repositories, MSR 2020, co-located with the 42nd International Conference on Software Engineering. ICSE 2020, https://doi.org/10.1145/3379597.3387487. |