
Predicting population health with machine learning: a scoping review

OBJECTIVE: To determine how machine learning has been applied to prediction applications in population health contexts. Specifically, to describe which outcomes have been studied, the data sources most widely used and whether reporting of machine learning predictive models aligns with established re...


Bibliographic Details
Main authors:	Morgenstern, Jason Denzil, Buajitti, Emmalin, O’Neill, Meghan, Piggott, Thomas, Goel, Vivek, Fridman, Daniel, Kornas, Kathy, Rosella, Laura C
Format:	Online Article Text
Language:	English
Published:	BMJ Publishing Group 2020
Subjects:	Public Health
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592293/ https://www.ncbi.nlm.nih.gov/pubmed/33109649 http://dx.doi.org/10.1136/bmjopen-2020-037860

_version_	1783601156290772992
author	Morgenstern, Jason Denzil Buajitti, Emmalin O’Neill, Meghan Piggott, Thomas Goel, Vivek Fridman, Daniel Kornas, Kathy Rosella, Laura C
author_sort	Morgenstern, Jason Denzil
collection	PubMed
description	OBJECTIVE: To determine how machine learning has been applied to prediction applications in population health contexts. Specifically, to describe which outcomes have been studied, the data sources most widely used and whether reporting of machine learning predictive models aligns with established reporting guidelines. DESIGN: A scoping review. DATA SOURCES: MEDLINE, EMBASE, CINAHL, ProQuest, Scopus, Web of Science, Cochrane Library, INSPEC and ACM Digital Library were searched on 18 July 2018. ELIGIBILITY CRITERIA: We included English articles published between 1980 and 2018 that used machine learning to predict population-health-related outcomes. We excluded studies that only used logistic regression or were restricted to a clinical context. DATA EXTRACTION AND SYNTHESIS: We summarised findings extracted from published reports, which included general study characteristics, aspects of model development, reporting of results and model discussion items. RESULTS: Of 22 618 articles found by our search, 231 were included in the review. The USA (n=71, 30.74%) and China (n=40, 17.32%) produced the most studies. Cardiovascular disease (n=22, 9.52%) was the most studied outcome. The median number of observations was 5414 (IQR=16 543.5) and the median number of features was 17 (IQR=31). Health records (n=126, 54.5%) and investigator-generated data (n=86, 37.2%) were the most common data sources. Many studies did not incorporate recommended guidelines on machine learning and predictive modelling. Predictive discrimination was commonly assessed using area under the receiver operator curve (n=98, 42.42%) and calibration was rarely assessed (n=22, 9.52%). CONCLUSIONS: Machine learning applications in population health have concentrated on regions and diseases well represented in traditional data sources, infrequently using big data. Important aspects of model development were under-reported. Greater use of big data and reporting guidelines for predictive modelling could improve machine learning applications in population health. REGISTRATION NUMBER: Registered on the Open Science Framework on 17 July 2018 (available at https://osf.io/rnqe6/).
format	Online Article Text
id	pubmed-7592293
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BMJ Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-7592293 2020-10-29 Predicting population health with machine learning: a scoping review Morgenstern, Jason Denzil Buajitti, Emmalin O’Neill, Meghan Piggott, Thomas Goel, Vivek Fridman, Daniel Kornas, Kathy Rosella, Laura C BMJ Open Public Health BMJ Publishing Group 2020-10-27 /pmc/articles/PMC7592293/ /pubmed/33109649 http://dx.doi.org/10.1136/bmjopen-2020-037860 Text en © Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. http://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
spellingShingle	Public Health Morgenstern, Jason Denzil Buajitti, Emmalin O’Neill, Meghan Piggott, Thomas Goel, Vivek Fridman, Daniel Kornas, Kathy Rosella, Laura C Predicting population health with machine learning: a scoping review
title	Predicting population health with machine learning: a scoping review
title_sort	predicting population health with machine learning: a scoping review
topic	Public Health
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592293/ https://www.ncbi.nlm.nih.gov/pubmed/33109649 http://dx.doi.org/10.1136/bmjopen-2020-037860
