Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods
BACKGROUND: To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technolog...
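Main authors: | Luo, Gang; Stone, Bryan L; Johnson, Michael D; Tarczy-Hornoch, Peter; Wilcox, Adam B; Mooney, Sean D; Sheng, Xiaoming; Haug, Peter J; Nkoy, Flory L |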
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
JMIR Publications
2017
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5596298/ https://www.ncbi.nlm.nih.gov/pubmed/28851678 http://dx.doi.org/10.2196/resprot.7757 |
_version_ | 1783263505372151808 |
---|---|
author | Luo, Gang Stone, Bryan L Johnson, Michael D Tarczy-Hornoch, Peter Wilcox, Adam B Mooney, Sean D Sheng, Xiaoming Haug, Peter J Nkoy, Flory L |
author_sort | Luo, Gang |
collection | PubMed |
description | BACKGROUND: To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack machine learning expertise to directly use clinical big data, creating a hurdle in realizing value from their data. Health care researchers can work with data scientists with deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a shortage in the United States of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient’s weight kept rising in the past year). This process becomes infeasible with limited budgets. OBJECTIVE: This study’s goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data. 
METHODS: This study will allow us to achieve the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes. RESULTS: We are currently writing Auto-ML’s design document. We intend to finish our study by around the year 2022. CONCLUSIONS: Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes. |
format | Online Article Text |
id | pubmed-5596298 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5596298 2017-09-20 Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods Luo, Gang Stone, Bryan L Johnson, Michael D Tarczy-Hornoch, Peter Wilcox, Adam B Mooney, Sean D Sheng, Xiaoming Haug, Peter J Nkoy, Flory L JMIR Res Protoc Proposal JMIR Publications 2017-08-29 /pmc/articles/PMC5596298/ /pubmed/28851678 http://dx.doi.org/10.2196/resprot.7757 Text en ©Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.08.2017. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on http://www.researchprotocols.org, as well as this copyright and license information must be included. |
title | Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods |
topic | Proposal |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5596298/ https://www.ncbi.nlm.nih.gov/pubmed/28851678 http://dx.doi.org/10.2196/resprot.7757 |