University of Oulu

Pecorelli, F., Lujan, S., Lenarduzzi, V. et al. On the adequacy of static analysis warnings with respect to code smell prediction. Empir Software Eng 27, 64 (2022).

On the adequacy of static analysis warnings with respect to code smell prediction

Author: Pecorelli, Fabiano (1); Lujan, Savanna (1); Lenarduzzi, Valentina (2, 3); et al.
Organizations: (1) Tampere University, Tampere, Finland
(2) LUT University, Lappeenranta, Finland
(3) University of Oulu, Oulu, Finland
(4) SeSa Lab, University of Salerno, Fisciano, Italy
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 2.7 MB)
Persistent link:
Language: English
Published: Springer Nature, 2022
Publish Date: 2022-05-10


Code smells are poor implementation choices that developers apply while evolving source code and that affect program maintainability. Multiple automated code smell detectors have been proposed: while most of them relied on heuristics applied over software metrics, a recent trend concerns the definition of machine learning techniques. However, machine learning-based code smell detectors still suffer from low accuracy: one of the causes is the lack of adequate features to feed machine learners. In this paper, we face this issue by investigating the role of static analysis warnings generated by three state-of-the-art tools to be used as features of machine learning models for the detection of seven code smell types. We conduct a three-step study in which we (1) verify the relation between static analysis warnings and code smells and the potential predictive power of these warnings; (2) build code smell prediction models exploiting and combining the most relevant features coming from the first analysis; (3) compare and combine the performance of the best code smell prediction model with the one achieved by a state-of-the-art approach. The results reveal the low performance of the models exploiting static analysis warnings alone, while we observe significant improvements when combining the warnings with additional code metrics. Nonetheless, we still find that the best model does not perform better than a random model, hence leaving open the challenges related to the definition of ad-hoc features for code smell prediction.
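
As a rough illustration of the idea summarized in the abstract, the sketch below shows one way static analysis warnings could be turned into per-class features, combined with code metrics, and scored against known labels. It is not the authors' pipeline: the warning categories, metric names, and helper functions are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical warning categories and metric names (assumptions for illustration;
# the record does not list the actual warning types or metrics used in the study).
WARNING_TYPES = ["unused_variable", "empty_catch", "missing_javadoc"]
METRIC_NAMES = ["loc", "wmc"]


def build_feature_vector(warnings, metrics):
    """Concatenate per-type warning counts with code metric values for one class."""
    counts = Counter(warnings)
    warning_features = [counts.get(w, 0) for w in WARNING_TYPES]
    metric_features = [metrics.get(m, 0) for m in METRIC_NAMES]
    return warning_features + metric_features


def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for binary labels (1 = smelly)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# One class with three warnings and two metric values (made-up inputs).
features = build_feature_vector(
    ["empty_catch", "empty_catch", "unused_variable"],
    {"loc": 120, "wmc": 14},
)
```

A vector like `features` could then be passed to any classifier, and `f_measure` used to compare a warnings-only model, a combined model, and a random baseline on the same labeled classes.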


Series: Empirical software engineering
ISSN: 1382-3256
ISSN-E: 1573-7616
ISSN-L: 1382-3256
Volume: 27
Article number: 64
DOI: 10.1007/s10664-022-10126-5
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: Fabio acknowledges the support of the Swiss National Science Foundation through the SNF Project No. PZ00P2 186090 (TED).
Copyright information: © The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit