Evaluating and enhancing the generalization performance of machine learning models for physical activity intensity prediction from raw acceleration data

Farrahi, Vahid; Niemelä, Maisa; Tjurin, Petra; Kangas, Maarit; Korpelainen, Raija; Jämsä, Timo

URL: https://doi.org/10.1109/JBHI.2019.2917565

Institute of Electrical and Electronics Engineers

20.05.2019

V. Farrahi, M. Niemelä, P. Tjurin, M. Kangas, R. Korpelainen and T. Jämsä, "Evaluating and Enhancing the Generalization Performance of Machine Learning Models for Physical Activity Intensity Prediction From Raw Acceleration Data," in IEEE Journal of Biomedical and Health Informatics, vol. 24, no. 1, pp. 27-38, Jan. 2020. doi: 10.1109/JBHI.2019.2917565

© The Authors 2019. This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.

The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2019091828613

Abstract

Purpose: To evaluate and enhance the generalization performance of machine learning models that predict physical activity intensity from raw acceleration data when they are applied to populations monitored with different activity monitors.

Method: Five datasets from four studies, each containing only hip- or wrist-based raw acceleration data (two hip-based and three wrist-based), were extracted. The datasets were used to develop and validate artificial neural networks (ANNs) in three setups to classify activity intensity categories (sedentary behavior, light, and moderate-to-vigorous). First, to examine generalizability, ANN models were developed using within-dataset (leave-one-subject-out) cross-validation and then cross-tested on the other datasets, which were collected with different accelerometers. Second, to enhance the models’ generalizability, a combination of four of the five datasets was used for training and the fifth dataset for validation. Finally, all five datasets were merged to develop a single model that is generalizable across the datasets (50% of the subjects from each dataset for training, the remainder for validation).
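The three evaluation setups differ only in how samples are partitioned, which a short sketch can make concrete. The Python code below is an illustration under stated assumptions, not the authors' implementation: the dataset and subject names are hypothetical toy labels, and the rounding and random seed used for the 50% split are assumptions, since the abstract does not specify them.

```python
import random


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group label."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


def split_subjects_in_half(subjects_by_dataset, seed=0):
    """Send half of each dataset's subjects to training, the rest to validation.

    Rounding down for odd subject counts is an assumption of this sketch.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for dataset in sorted(subjects_by_dataset):
        subjects = sorted(subjects_by_dataset[dataset])
        rng.shuffle(subjects)
        half = len(subjects) // 2
        train += subjects[:half]
        valid += subjects[half:]
    return train, valid


# Hypothetical toy samples: one (dataset, subject) pair per sample.
samples = [
    ("hip_A", "s1"), ("hip_A", "s2"),
    ("hip_B", "s3"), ("hip_B", "s4"),
    ("wrist_C", "s5"), ("wrist_C", "s6"),
]
dataset_labels = [d for d, _ in samples]
subject_labels = [s for _, s in samples]

# Setup 1: leave-one-subject-out (shown on the pooled toy data for brevity;
# in the paper it is applied within each dataset separately).
loso_folds = list(leave_one_group_out(subject_labels))

# Setup 2: train on all datasets but one, validate on the held-out dataset.
lodo_folds = list(leave_one_group_out(dataset_labels))

# Setup 3: half of the subjects of every dataset for training.
subjects_by_dataset = {}
for d, s in samples:
    subjects_by_dataset.setdefault(d, set()).add(s)
train_subjects, valid_subjects = split_subjects_in_half(subjects_by_dataset)
```

Splitting by subject or by dataset rather than by individual sample keeps all data from a held-out person or device out of training, which is what makes the validation score an estimate of generalization.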

Results: The models showed high performance in within-dataset cross-validation (accuracy 71.9–95.4%, Kappa K=0.63–0.94). The performance of the within-dataset validated models decreased when they were applied to datasets with different accelerometers (accuracy 41.2–59.9%, K=0.21–0.48). Models trained on merged datasets consisting of hip and wrist data predicted the left-out dataset with acceptable performance (accuracy 65.9–83.7%, K=0.61–0.79). The model trained with all five datasets performed acceptably across the datasets (accuracy 80.4–90.7%, K=0.68–0.89).
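For readers unfamiliar with the two reported metrics, the following Python sketch computes accuracy and a kappa statistic on toy labels. It assumes that K denotes unweighted Cohen's kappa, which the abstract does not state explicitly; the class abbreviations (SB, LPA, MVPA) and the label sequences are hypothetical and are not data from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of marginal class frequencies, summed over classes.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for the three intensity categories.
y_true = ["SB", "SB", "LPA", "LPA", "MVPA", "MVPA"]
y_pred = ["SB", "SB", "LPA", "MVPA", "MVPA", "MVPA"]

acc = accuracy(y_true, y_pred)       # 5 of 6 correct
kappa = cohen_kappa(y_true, y_pred)  # approximately 0.75
```

Because kappa subtracts the agreement expected by chance, it is lower than accuracy on the same predictions, which is consistent with the paired values reported above.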

Conclusions: Integrating heterogeneous datasets into the training set appears to be a viable approach for enhancing the generalization performance of the models. In contrast, within-dataset validation alone is not sufficient to understand the models’ performance on other populations monitored with different accelerometers.
