Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands...
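Main authors: | McPadden, Jacob; Durant, Thomas JS; Bunch, Dustin R; Coppi, Andreas; Price, Nathaniel; Rodgerson, Kris; Torre Jr, Charles J; Byron, William; Hsiao, Allen L; Krumholz, Harlan M; Schulz, Wade L |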
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477571/ https://www.ncbi.nlm.nih.gov/pubmed/30964441 http://dx.doi.org/10.2196/13043 |
_version_ | 1783413041653612544 |
---|---|
author | McPadden, Jacob Durant, Thomas JS Bunch, Dustin R Coppi, Andreas Price, Nathaniel Rodgerson, Kris Torre Jr, Charles J Byron, William Hsiao, Allen L Krumholz, Harlan M Schulz, Wade L |
author_facet | McPadden, Jacob Durant, Thomas JS Bunch, Dustin R Coppi, Andreas Price, Nathaniel Rodgerson, Kris Torre Jr, Charles J Byron, William Hsiao, Allen L Krumholz, Harlan M Schulz, Wade L |
author_sort | McPadden, Jacob |
collection | PubMed |
description | BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth. OBJECTIVE: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform. METHODS: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics. RESULTS: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases. CONCLUSIONS: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research. |
format | Online Article Text |
id | pubmed-6477571 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-6477571 2019-05-08 Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform McPadden, Jacob Durant, Thomas JS Bunch, Dustin R Coppi, Andreas Price, Nathaniel Rodgerson, Kris Torre Jr, Charles J Byron, William Hsiao, Allen L Krumholz, Harlan M Schulz, Wade L J Med Internet Res Original Paper BACKGROUND: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth. OBJECTIVE: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform. METHODS: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics. RESULTS: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases. CONCLUSIONS: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research. JMIR Publications 2019-04-09 /pmc/articles/PMC6477571/ /pubmed/30964441 http://dx.doi.org/10.2196/13043 Text en ©Jacob McPadden, Thomas JS Durant, Dustin R Bunch, Andreas Coppi, Nathaniel Price, Kris Rodgerson, Charles J Torre Jr, William Byron, Allen L Hsiao, Harlan M Krumholz, Wade L Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.04.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper McPadden, Jacob Durant, Thomas JS Bunch, Dustin R Coppi, Andreas Price, Nathaniel Rodgerson, Kris Torre Jr, Charles J Byron, William Hsiao, Allen L Krumholz, Harlan M Schulz, Wade L Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title | Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title_full | Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title_fullStr | Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title_full_unstemmed | Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title_short | Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform |
title_sort | health care and precision medicine research: analysis of a scalable data science platform |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477571/ https://www.ncbi.nlm.nih.gov/pubmed/30964441 http://dx.doi.org/10.2196/13043 |