University of Oulu

Rasheed Omobolaji Alabi, Mohammed Elmusrati, Iris Sawazaki‐Calone, Luiz Paulo Kowalski, Caj Haglund, Ricardo D. Coletta, Antti A. Mäkitie, Tuula Salo, Alhadi Almangush, Ilmo Leivo, Comparison of supervised machine learning classification techniques in prediction of locoregional recurrences in early oral tongue cancer, International Journal of Medical Informatics, Volume 136, 2020, 104068, ISSN 1386-5056,

Comparison of supervised machine learning classification techniques in prediction of locoregional recurrences in early oral tongue cancer

Author: Alabi, Rasheed Omobolaji1; Elmusrati, Mohammed1; Sawazaki‐Calone, Iris2;
Organizations: 1Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
2Oral Pathology and Oral Medicine, Dentistry School, Western Parana State University, Cascavel, PR, Brazil
3Department of Head and Neck Surgery and Otorhinolaryngology, A.C. Camargo Cancer Center, São Paulo, SP, Brazil
4Research Programs Unit, Translational Cancer Biology, University of Helsinki, Helsinki, Finland
5Department of Oral Diagnosis, School of Dentistry, State University of Campinas, Piracicaba, São Paulo, Brazil
6Department of Otorhinolaryngology – Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
7Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
8Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
9Department of Pathology, University of Helsinki, Helsinki, Finland
10Department of Oral and Maxillofacial Diseases, University of Helsinki, Helsinki, Finland
11Cancer and Translational Medicine Research Unit, Medical Research Center Oulu, University of Oulu and Oulu University Hospital, Oulu, Finland
12Institute of Biomedicine, Pathology, University of Turku, Turku, Finland
13Faculty of Dentistry, University of Misurata, Misurata, Libya
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 2.8 MB)
Language: English
Published: Elsevier, 2020
Publish Date: 2020-12-28


Background: Accurate estimation of the risk of recurrence in early-stage oral tongue squamous cell carcinoma (OTSCC) is essential for individual treatment decision-making. However, this remains a challenge even for experienced multidisciplinary centers.

Objectives: We compared the performance of four machine learning (ML) algorithms for predicting the risk of locoregional recurrences in patients with OTSCC. These algorithms were Support Vector Machine (SVM), Naive Bayes (NB), Boosted Decision Tree (BDT), and Decision Forest (DF).

Materials and methods: The study cohort comprised 311 cases from the five University Hospitals in Finland and A.C. Camargo Cancer Center, São Paulo, Brazil. To compare the algorithms, we used the F1 score (the harmonic mean of precision and recall), specificity, and accuracy. These algorithms and their corresponding permutation feature importance (PFI) with the input parameters were externally tested on 59 new cases. Furthermore, we compared the performance of the algorithm that showed the highest prediction accuracy with the prognostic significance of depth of invasion (DOI).
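As an illustration of the comparison metrics named above, the following minimal sketch computes accuracy, specificity, and F1 score from the four cells of a binary confusion matrix. It is not the authors' code; the function name `classification_metrics` and the example counts are hypothetical and do not come from the study data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, specificity, precision, recall and F1 from confusion counts.

    tp/fp/tn/fn are the counts of true positives, false positives,
    true negatives and false negatives for a binary classifier.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Specificity: share of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
print(metrics)
```

Because F1 ignores true negatives, reporting specificity alongside it gives a fuller picture when recurrence and non-recurrence cases are unevenly represented.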

Results: The average specificity of all the algorithms was 71%. The SVM showed an accuracy of 68% and F1 score of 0.63, NB an accuracy of 70% and F1 score of 0.64, BDT an accuracy of 81% and F1 score of 0.78, and DF an accuracy of 78% and F1 score of 0.70. Additionally, these algorithms outperformed the DOI-based approach, which gave an accuracy of 63%. With PFI analysis, there was no significant difference in the overall accuracies of three of the algorithms: PFI-BDT accuracy increased to 83.1%, PFI-DF increased to 80%, and PFI-SVM decreased to 64.4%, while PFI-NB accuracy increased significantly to 81.4%.
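The general idea behind permutation feature importance can be sketched as follows: shuffle one input column at a time and record how much the accuracy drops relative to the unshuffled baseline. This is a generic, assumption-laden illustration rather than the procedure or tooling used in the study; the toy model, the synthetic data, and the names `accuracy` and `permutation_importance` are all hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other columns untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic toy data: the label depends only on feature 0.
X = [[i / 10, ((i * 7) % 10) / 10] for i in range(10)]
y = [1 if row[0] > 0.5 else 0 for row in X]


def toy_model(row):
    """Hypothetical classifier that looks only at feature 0."""
    return 1 if row[0] > 0.5 else 0


importances = permutation_importance(toy_model, X, y)
print(importances)
```

In this toy setup the ignored feature receives an importance of zero, while shuffling the feature the model relies on lowers accuracy. Features with low importance are candidates for removal, which is the sense in which a reduced parameter set can be retested.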

Conclusions: Our findings show that the best classification accuracy was achieved with the boosted decision tree algorithm. Additionally, these algorithms outperformed the DOI-based approach. Furthermore, with the few parameters identified in the PFI analysis, the ML techniques still showed the ability to predict locoregional recurrence. Applying the boosted decision tree machine learning algorithm can stratify OTSCC patients and thus aid in their individual treatment planning.


Series: International journal of medical informatics
ISSN: 1386-5056
ISSN-E: 1872-8243
ISSN-L: 1386-5056
Volume: 136
Article number: 104068
DOI: 10.1016/j.ijmedinf.2019.104068
Type of Publication: A1 Journal article – refereed
Field of Science: 3122 Cancers
113 Computer and information sciences
Funding: The School of Technology and Innovations, University of Vaasa Scholarship Fund; Turku University Hospital Research Fund; Helsinki University Hospital Research Fund; and the Finnish Cancer Society.
Copyright information: © 2020 Elsevier B.V. This manuscript version is made available under the CC-BY-NC-ND 4.0 license