Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform

BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands...

Bibliographic Details
Main Authors: McPadden, Jacob, Durant, Thomas JS, Bunch, Dustin R, Coppi, Andreas, Price, Nathaniel, Rodgerson, Kris, Torre Jr, Charles J, Byron, William, Hsiao, Allen L, Krumholz, Harlan M, Schulz, Wade L
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477571/
https://www.ncbi.nlm.nih.gov/pubmed/30964441
http://dx.doi.org/10.2196/13043
_version_ 1783413041653612544
author McPadden, Jacob
Durant, Thomas JS
Bunch, Dustin R
Coppi, Andreas
Price, Nathaniel
Rodgerson, Kris
Torre Jr, Charles J
Byron, William
Hsiao, Allen L
Krumholz, Harlan M
Schulz, Wade L
author_facet McPadden, Jacob
Durant, Thomas JS
Bunch, Dustin R
Coppi, Andreas
Price, Nathaniel
Rodgerson, Kris
Torre Jr, Charles J
Byron, William
Hsiao, Allen L
Krumholz, Harlan M
Schulz, Wade L
author_sort McPadden, Jacob
collection PubMed
description BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth. OBJECTIVE: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform. METHODS: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics. RESULTS: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases. CONCLUSIONS: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. 
Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research.
format Online
Article
Text
id pubmed-6477571
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-64775712019-05-08 Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform McPadden, Jacob Durant, Thomas JS Bunch, Dustin R Coppi, Andreas Price, Nathaniel Rodgerson, Kris Torre Jr, Charles J Byron, William Hsiao, Allen L Krumholz, Harlan M Schulz, Wade L J Med Internet Res Original Paper BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth. OBJECTIVE: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform. METHODS: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics. RESULTS: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases. CONCLUSIONS: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. 
These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research. JMIR Publications 2019-04-09 /pmc/articles/PMC6477571/ /pubmed/30964441 http://dx.doi.org/10.2196/13043 Text en ©Jacob McPadden, Thomas JS Durant, Dustin R Bunch, Andreas Coppi, Nathaniel Price, Kris Rodgerson, Charles J Torre Jr, William Byron, Allen L Hsiao, Harlan M Krumholz, Wade L Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.04.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
McPadden, Jacob
Durant, Thomas JS
Bunch, Dustin R
Coppi, Andreas
Price, Nathaniel
Rodgerson, Kris
Torre Jr, Charles J
Byron, William
Hsiao, Allen L
Krumholz, Harlan M
Schulz, Wade L
Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title_full Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title_fullStr Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title_full_unstemmed Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title_short Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
title_sort health care and precision medicine research: analysis of a scalable data science platform
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477571/
https://www.ncbi.nlm.nih.gov/pubmed/30964441
http://dx.doi.org/10.2196/13043