A cloud-based pipeline for analysis of FHIR and long-read data
MOTIVATION: As genome sequencing becomes cheaper and more accurate, it is becoming increasingly viable to merge this data with electronic health information to inform clinical decisions. RESULTS: In this work, we demonstrate a full pipeline for working with both PacBio sequencing data and clinical F...
Main authors: | Dunn, Tim; Cosgun, Erdal |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9872570/ https://www.ncbi.nlm.nih.gov/pubmed/36726729 http://dx.doi.org/10.1093/bioadv/vbac095 |
_version_ | 1784877433139232768 |
---|---|
author | Dunn, Tim Cosgun, Erdal |
collection | PubMed |
description | MOTIVATION: As genome sequencing becomes cheaper and more accurate, it is becoming increasingly viable to merge this data with electronic health information to inform clinical decisions. RESULTS: In this work, we demonstrate a full pipeline for working with both PacBio sequencing data and clinical FHIR® data, from initial data to tertiary analysis. The electronic health records are stored in FHIR® (Fast Healthcare Interoperability Resources) format, the current leading standard for healthcare data exchange. For the genomic data, we perform variant calling on long-read PacBio HiFi data using Cromwell on Azure. Both data formats are parsed, processed and merged in a single scalable pipeline which securely performs tertiary analyses using cloud-based Jupyter notebooks. We include three example applications: exporting patient information to a database, clustering patients and performing a simple pharmacogenomic study. AVAILABILITY AND IMPLEMENTATION: https://github.com/microsoft/genomicsnotebook/tree/main/fhirgenomics SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-9872570 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9872570 2023-01-31 A cloud-based pipeline for analysis of FHIR and long-read data Dunn, Tim Cosgun, Erdal Bioinform Adv Original Paper MOTIVATION: As genome sequencing becomes cheaper and more accurate, it is becoming increasingly viable to merge this data with electronic health information to inform clinical decisions. RESULTS: In this work, we demonstrate a full pipeline for working with both PacBio sequencing data and clinical FHIR® data, from initial data to tertiary analysis. The electronic health records are stored in FHIR® (Fast Healthcare Interoperability Resources) format, the current leading standard for healthcare data exchange. For the genomic data, we perform variant calling on long-read PacBio HiFi data using Cromwell on Azure. Both data formats are parsed, processed and merged in a single scalable pipeline which securely performs tertiary analyses using cloud-based Jupyter notebooks. We include three example applications: exporting patient information to a database, clustering patients and performing a simple pharmacogenomic study. AVAILABILITY AND IMPLEMENTATION: https://github.com/microsoft/genomicsnotebook/tree/main/fhirgenomics SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2023-01-20 /pmc/articles/PMC9872570/ /pubmed/36726729 http://dx.doi.org/10.1093/bioadv/vbac095 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A cloud-based pipeline for analysis of FHIR and long-read data |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9872570/ https://www.ncbi.nlm.nih.gov/pubmed/36726729 http://dx.doi.org/10.1093/bioadv/vbac095 |