Predicting Low Cognitive Ability at Age 5—Feature Selection Using Machine Learning Methods and Birth Cohort Data

Objectives: In this study, we applied the random forest (RF) algorithm to birth-cohort data to train a model to predict low cognitive ability at 5 years of age and to identify the important predictive features. Methods: Data was from 1,070 participants in the Irish population-based BASELINE cohort....


Bibliographic Details
Main Authors: Bowe, Andrea K., Lightbody, Gordon, Staines, Anthony, Kiely, Mairead E., McCarthy, Fergus P., Murray, Deirdre M.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9684182/
https://www.ncbi.nlm.nih.gov/pubmed/36439276
http://dx.doi.org/10.3389/ijph.2022.1605047
author Bowe, Andrea K.
Lightbody, Gordon
Staines, Anthony
Kiely, Mairead E.
McCarthy, Fergus P.
Murray, Deirdre M.
collection PubMed
description Objectives: In this study, we applied the random forest (RF) algorithm to birth-cohort data to train a model to predict low cognitive ability at 5 years of age and to identify the important predictive features. Methods: Data was from 1,070 participants in the Irish population-based BASELINE cohort. A RF model was trained to predict an intelligence quotient (IQ) score ≤90 at age 5 years using maternal, infant, and sociodemographic features. Feature importance was examined and internal validation performed using 10-fold cross-validation repeated 5 times. Results: The five most important predictive features were the total years of maternal schooling, infant Apgar score at 1 min, socioeconomic index, maternal BMI, and alcohol consumption in the first trimester. On internal validation a parsimonious RF model based on 11 features showed excellent predictive ability, correctly classifying 95% of participants. This provides a foundation suitable for external validation in an unseen cohort. Conclusion: Machine learning approaches to large existing datasets can provide accurate feature selection to improve risk prediction. Further validation of this model is required in cohorts representative of the general population.
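The validation scheme named in the abstract (10-fold cross-validation repeated 5 times, with the outcome defined as IQ ≤90) can be illustrated with a minimal Python sketch. It shows only the resampling and the outcome threshold, not the random forest itself. The function names `repeated_kfold_indices` and `low_cognitive_ability`, the seed, and the unstratified shuffling are assumptions made for illustration; the record does not state which software the authors used or whether folds were stratified.

```python
import random


def repeated_kfold_indices(n_samples, n_splits=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: plain (unstratified) shuffling is an assumption.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        # Fold sizes differ by at most one when n_samples % n_splits != 0.
        base, extra = divmod(n_samples, n_splits)
        start = 0
        for fold in range(n_splits):
            size = base + (1 if fold < extra else 0)
            test_idx = order[start:start + size]
            train_idx = order[:start] + order[start + size:]
            start += size
            yield repeat, fold, train_idx, test_idx


def low_cognitive_ability(iq_score, threshold=90):
    """Binary outcome from the abstract: IQ score at or below 90."""
    return iq_score <= threshold


# Cohort size taken from the abstract; whether every participant entered
# the model is not stated in this record.
splits = list(repeated_kfold_indices(n_samples=1070))
print(len(splits), len(splits[0][2]), len(splits[0][3]))
```

If all 1,070 participants were used, each of the 50 resulting splits would train on 963 participants and hold out 107, and every participant would be held out exactly once per repeat.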
format Online
Article
Text
id pubmed-9684182
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9684182 2022-11-25 Predicting Low Cognitive Ability at Age 5—Feature Selection Using Machine Learning Methods and Birth Cohort Data Bowe, Andrea K. Lightbody, Gordon Staines, Anthony Kiely, Mairead E. McCarthy, Fergus P. Murray, Deirdre M. Int J Public Health Public Health Archive Objectives: In this study, we applied the random forest (RF) algorithm to birth-cohort data to train a model to predict low cognitive ability at 5 years of age and to identify the important predictive features. Methods: Data was from 1,070 participants in the Irish population-based BASELINE cohort. A RF model was trained to predict an intelligence quotient (IQ) score ≤90 at age 5 years using maternal, infant, and sociodemographic features. Feature importance was examined and internal validation performed using 10-fold cross-validation repeated 5 times. Results: The five most important predictive features were the total years of maternal schooling, infant Apgar score at 1 min, socioeconomic index, maternal BMI, and alcohol consumption in the first trimester. On internal validation a parsimonious RF model based on 11 features showed excellent predictive ability, correctly classifying 95% of participants. This provides a foundation suitable for external validation in an unseen cohort. Conclusion: Machine learning approaches to large existing datasets can provide accurate feature selection to improve risk prediction. Further validation of this model is required in cohorts representative of the general population. Frontiers Media S.A. 2022-11-10 /pmc/articles/PMC9684182/ /pubmed/36439276 http://dx.doi.org/10.3389/ijph.2022.1605047 Text en Copyright © 2022 Bowe, Lightbody, Staines, Kiely, McCarthy and Murray. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Predicting Low Cognitive Ability at Age 5—Feature Selection Using Machine Learning Methods and Birth Cohort Data
topic Public Health Archive
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9684182/
https://www.ncbi.nlm.nih.gov/pubmed/36439276
http://dx.doi.org/10.3389/ijph.2022.1605047