
Lessons and tips for designing a machine learning study using EHR data

Machine learning (ML) provides the ability to examine massive datasets and uncover patterns within data without relying on a priori assumptions such as specific variable associations, linearity in relationships, or prespecified statistical interactions. However, the application of ML to healthcare d...


Bibliographic Details
Main authors: Arbet, Jaron; Brokamp, Cole; Meinzen-Derr, Jareen; Trinkley, Katy E.; Spratt, Heidi M.
Format: Online Article Text
Language: English
Published: Cambridge University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057454/
https://www.ncbi.nlm.nih.gov/pubmed/33948244
http://dx.doi.org/10.1017/cts.2020.513
_version_ 1783680839958134784
author Arbet, Jaron
Brokamp, Cole
Meinzen-Derr, Jareen
Trinkley, Katy E.
Spratt, Heidi M.
author_sort Arbet, Jaron
collection PubMed
description Machine learning (ML) provides the ability to examine massive datasets and uncover patterns within data without relying on a priori assumptions such as specific variable associations, linearity in relationships, or prespecified statistical interactions. However, the application of ML to healthcare data has been met with mixed results, especially when using administrative datasets such as the electronic health record. The black box nature of many ML algorithms contributes to an erroneous assumption that these algorithms can overcome major data issues inherent in large administrative healthcare data. As with other research endeavors, good data and analytic design is crucial to ML-based studies. In this paper, we will provide an overview of common misconceptions for ML, the corresponding truths, and suggestions for incorporating these methods into healthcare research while maintaining a sound study design.
format Online
Article
Text
id pubmed-8057454
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cambridge University Press
record_format MEDLINE/PubMed
spelling pubmed-80574542021-05-03 Lessons and tips for designing a machine learning study using EHR data Arbet, Jaron Brokamp, Cole Meinzen-Derr, Jareen Trinkley, Katy E. Spratt, Heidi M. J Clin Transl Sci Review Article Machine learning (ML) provides the ability to examine massive datasets and uncover patterns within data without relying on a priori assumptions such as specific variable associations, linearity in relationships, or prespecified statistical interactions. However, the application of ML to healthcare data has been met with mixed results, especially when using administrative datasets such as the electronic health record. The black box nature of many ML algorithms contributes to an erroneous assumption that these algorithms can overcome major data issues inherent in large administrative healthcare data. As with other research endeavors, good data and analytic design is crucial to ML-based studies. In this paper, we will provide an overview of common misconceptions for ML, the corresponding truths, and suggestions for incorporating these methods into healthcare research while maintaining a sound study design. Cambridge University Press 2020-07-24 /pmc/articles/PMC8057454/ /pubmed/33948244 http://dx.doi.org/10.1017/cts.2020.513 Text en © The Association for Clinical and Translational Science 2020 https://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Lessons and tips for designing a machine learning study using EHR data
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057454/
https://www.ncbi.nlm.nih.gov/pubmed/33948244
http://dx.doi.org/10.1017/cts.2020.513