University of Oulu

Sebastian Rauschert, Phillip E. Melton, Anni Heiskala, Ville Karhunen, Graham Burdge, Jeffrey M. Craig, Keith M. Godfrey, Karen Lillycrop, Trevor A. Mori, Lawrence J. Beilin, Wendy H. Oddy, Craig Pennell, Marjo-Riitta Järvelin, Sylvain Sebert and Rae-Chi Huang (2020) Machine Learning-Based DNA Methylation Score for Fetal Exposure to Maternal Smoking: Development and Validation in Samples Collected from Adolescents and Adults, Environmental Health Perspectives 128:9 CID: 097003

Machine learning-based DNA methylation score for fetal exposure to maternal smoking : development and validation in samples collected from adolescents and adults

Author: Rauschert, Sebastian1; Melton, Phillip E.2,3,4; Heiskala, Anni5;
Organizations: 1Telethon Kids Institute, University of Western Australia, Nedlands, Perth, Western Australia, Australia
2Centre for Genetic Origins of Health and Disease, University of Western Australia, Perth, Australia
3School of Pharmacy and Biomedical Sciences, Faculty of Health Sciences, Curtin University, Perth, Australia
4Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
5Center for Life Course Health Research, University of Oulu, Oulu, Finland
6Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, Imperial College London, London, UK
7Institute of Developmental Sciences, University of Southampton, Faculty of Medicine, Southampton, UK
8Centre for Molecular and Medical Research, School of Medicine, Deakin University, Geelong, Victoria, Australia
9Molecular Epidemiology, Murdoch Children’s Research Institute, Parkville, Australia
10MRC Lifecourse Epidemiology Unit and NIHR Southampton Biomedical Research Centre, University of Southampton and University Hospital Southampton NHS Foundation Trust, Southampton, UK
11Biological Sciences, Faculty of Natural and Environmental Sciences, University of Southampton, Southampton, Hampshire, UK
12Medical School, Royal Perth Hospital Unit, University of Western Australia, Perth, Western Australia
13School of Medicine and Public Health, University of Newcastle, Newcastle, New South Wales, Australia
14Unit of Primary Care, Oulu University Hospital, Oulu, Finland
15Department of Metabolism, Digestion and Reproduction, Genomic Medicine, Imperial College London, London, UK
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 0.4 MB)
Language: English
Published: National Institute of Environmental Health Sciences, 2020
Publish Date: 2021-02-17


Background: Fetal exposure to maternal smoking during pregnancy is associated with the development of noncommunicable diseases in the offspring. Maternal smoking may induce such long-term effects through persistent changes in the DNA methylome, which therefore hold the potential to be used as a biomarker of this early life exposure. With declining costs for measuring DNA methylation, we aimed to develop a DNA methylation score that can be used on adolescent DNA methylation data and thereby generate a score for in utero cigarette smoke exposure.

Methods: We used machine learning methods to create a score reflecting exposure to maternal smoking during pregnancy. This score is based on peripheral blood measurements of DNA methylation (Illumina’s Infinium HumanMethylation450K BeadChip). The score was developed and tested in the Raine Study with data from 995 white 17-y-old participants using 10-fold cross-validation. The score was further tested and validated in independent data from the Northern Finland Birth Cohort 1986 (NFBC1986) (16-y-olds) and 1966 (NFBC1966) (31-y-olds). Further, three previously proposed DNA methylation scores were applied for comparison. The final score was developed with 204 CpGs using elastic net regression.
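As an illustration of how a score of this kind is typically applied to new data, the sketch below computes a weighted sum of CpG methylation (beta) values. It assumes the fitted elastic net model is a linear predictor with one coefficient per retained CpG. The CpG names, weights, and intercept are hypothetical placeholders, not the 204 CpGs or coefficients from the article.

```python
from typing import Mapping


def methylation_score(
    betas: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float = 0.0,
) -> float:
    """Return intercept + sum(weight * beta) over the CpGs in `weights`.

    `betas` maps CpG identifiers to methylation beta values (0 to 1).
    CpGs present in `betas` but absent from `weights` are ignored.
    """
    missing = [cpg for cpg in weights if cpg not in betas]
    if missing:
        raise KeyError(f"missing CpG measurements: {missing}")
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items())


# Hypothetical coefficients for illustration only (not from the article).
EXAMPLE_WEIGHTS = {"cpg_a": -2.0, "cpg_b": 1.5, "cpg_c": 0.5}
EXAMPLE_INTERCEPT = 0.25

# One hypothetical participant's beta values.
sample = {"cpg_a": 0.5, "cpg_b": 0.5, "cpg_c": 0.5, "cpg_unused": 0.9}
score = methylation_score(sample, EXAMPLE_WEIGHTS, EXAMPLE_INTERCEPT)
print(score)
```

To classify a participant as exposed or unexposed, the resulting score would be compared against a cutoff; how that cutoff was chosen is not stated in this record.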

Results: Sensitivity and specificity values for the best performing previously developed classifier (“Reese Score”) were 88% and 72% for Raine, 87% and 61% for NFBC1986 and 72% and 70% for NFBC1966, respectively; corresponding figures using the elastic net regression approach were 91% and 76% (Raine), 87% and 75% (NFBC1986), and 72% and 78% for NFBC1966.
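For readers unfamiliar with the two metrics reported here, the following minimal sketch computes sensitivity (the proportion of exposed participants correctly classified) and specificity (the proportion of unexposed participants correctly classified) from binary labels. The toy labels are invented for illustration and are unrelated to the cohort data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = exposed)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 4 exposed and 6 unexposed participants (invented labels).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```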

Conclusion: We have developed a DNA methylation score for exposure to maternal smoking during pregnancy, outperforming the three previously developed scores. One possible application of the current score could be for model adjustment purposes or to assess its association with distal health outcomes where part of the effect can be attributed to maternal smoking. Further, it may provide a biomarker for fetal exposure to maternal smoking.

Series: Environmental health perspectives
ISSN: 0091-6765
ISSN-E: 1552-9924
ISSN-L: 0091-6765
Volume: 128
Issue: 9
Pages: 1 - 11
Article number: 097003
DOI: 10.1289/EHP6076
Type of Publication: A1 Journal article – refereed
Field of Science: 3142 Public health care science, environmental and occupational health
Funding: This work was supported by resources provided by The Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. S.R. received funding for this work from the European Union’s Horizon 2020 project LifeCycle, as an awardee of the LifeCycle fellowship 2018. The DNA methylation work was supported by NHMRC grant 1059711. R.C.H. and T.A.M. are supported by NHMRC fellowships (grant number 1053384 and 1042255, respectively). K.M.G. is supported by the UK Medical Research Council (MC_UU_12011/4), the National Institute for Health Research [as an NIHR Senior Investigator (NF-SI-0515-10042) and through the NIHR Southampton Biomedical Research Centre] and the European Union’s Erasmus+ Capacity-Building ENeASEA Project and Seventh Framework Programme (FP7/2007-2013), projects EarlyNutrition and ODIN under grant agreement numbers 289346 and 613977. NFBC1966 received financial support from University of Oulu Grant no. 65354, Oulu University Hospital grant no. 2/97, 8/97; Ministry of Health and Social Affairs grant no. 23/251/97, 160/97, 190/97; National Institute for Health and Welfare Helsinki grant no. 54121; and Regional Institute of Occupational Health, Oulu, Finland grant no. 50621, 54231. NFBC1986 received financial support from EU QLG1-CT-2000-01643 (EUROBLCS) grant no. E51560; NorFA grant no. 731, 20056, 30167; USA/NIHH 2000 G DF682 grant no. 50945. S.S., A.H., V.K., and M.R.J. received support by H2020-633595 DynaHEALTH, H2020 733206 LifeCycle, the Academy of Finland EGEA-project (285547), the Biocenter Oulu, H2020-733206 LifeCycle, H2020-824989 EUCANConnect, EU-H2020 (grant no. 82576), EU-H2020 EarlyCause (grant no. 848158), EU-H2020 LongITools (grant no. 873749), EU H2020-MSCA-ITN-2016 CAPICE Marie Sklodowska-Curie grant (grant no. 721567), and the Medical Research Council, UK [grant nos. MR/M013138/1, MRC/BBSRC MR/S03658X/1 (the EU JPI HDHL)]. V.K. is funded by the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant (721567).
EU Grant Number: (633595) DYNAHEALTH - Understanding the dynamic determinants of glucose homeostasis and social capability to promote Healthy and active aging
(733206) LIFECYCLE - Early-life stressors and LifeCycle health
(824989) EUCAN-Connect - A federated FAIR platform enabling large-scale analysis of high-value cohort data connecting Europe and Canada in personalized health
(848158) EarlyCause - Causative mechanisms & integrative models linking early-life-stress to psycho-cardio-metabolic multi-morbidity
Academy of Finland Grant Number: 285547
Detailed Information: 285547 (Academy of Finland Funding decision)
Copyright information: © 2020 The Authors. Reproduced with permission from Environmental Health Perspectives.