Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study

BACKGROUND: Clinical sequencing data should be shared in order to achieve the sufficient scale and diversity required to provide strong evidence for improving patient care. A distributed research network allows researchers to share this evidence rather than the patient-level data across centers, the...

Bibliographic Details
Main authors: Shin, Seo Jeong; You, Seng Chan; Park, Yu Rang; Roh, Jin; Kim, Jang-Hee; Haam, Seokjin; Reich, Christian G; Blacketer, Clair; Son, Dae-Soon; Oh, Seungbin; Park, Rae Woong
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454347/
https://www.ncbi.nlm.nih.gov/pubmed/30912749
http://dx.doi.org/10.2196/13249
_version_ 1783409557801795584
author Shin, Seo Jeong
You, Seng Chan
Park, Yu Rang
Roh, Jin
Kim, Jang-Hee
Haam, Seokjin
Reich, Christian G
Blacketer, Clair
Son, Dae-Soon
Oh, Seungbin
Park, Rae Woong
author_facet Shin, Seo Jeong
You, Seng Chan
Park, Yu Rang
Roh, Jin
Kim, Jang-Hee
Haam, Seokjin
Reich, Christian G
Blacketer, Clair
Son, Dae-Soon
Oh, Seungbin
Park, Rae Woong
author_sort Shin, Seo Jeong
collection PubMed
description BACKGROUND: Clinical sequencing data should be shared in order to achieve the sufficient scale and diversity required to provide strong evidence for improving patient care. A distributed research network allows researchers to share this evidence rather than the patient-level data across centers, thereby avoiding privacy issues. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) used in distributed research networks has low coverage of sequencing data and does not reflect the latest trends of precision medicine. OBJECTIVE: The aim of this study was to develop and evaluate the feasibility of a genomic CDM (G-CDM), as an extension of the OMOP-CDM, for application of genomic data in clinical practice. METHODS: Existing genomic data models and sequencing reports were reviewed to extend the OMOP-CDM to cover genomic data. The Human Genome Organisation Gene Nomenclature Committee and Human Genome Variation Society nomenclature were adopted to standardize the terminology in the model. Sequencing data of 114 and 1060 patients with lung cancer were obtained from the Ajou University School of Medicine database of Ajou University Hospital and The Cancer Genome Atlas, respectively, which were transformed to a format appropriate for the G-CDM. The data were compared with respect to gene name, variant type, and actionable mutations. RESULTS: The G-CDM was extended into four tables linked to tables of the OMOP-CDM. Upon comparison with The Cancer Genome Atlas data, a clinically actionable mutation, p.Leu858Arg, in the EGFR gene was 6.64 times more frequent in the Ajou University School of Medicine database, while the p.Gly12Xaa mutation in the KRAS gene was 2.02 times more frequent in The Cancer Genome Atlas dataset. The data-exploring tool GeneProfiler was further developed to conduct descriptive analyses automatically using the G-CDM, which provides the proportions of genes, variant types, and actionable mutations. 
GeneProfiler also allows for querying the specific gene name and Human Genome Variation Society nomenclature to calculate the proportion of patients with a given mutation. CONCLUSIONS: We developed the G-CDM for effective integration of genomic data with standardized clinical data, allowing for data sharing across institutes. The feasibility of the G-CDM was validated by assessing the differences in data characteristics between two different genomic databases through the proposed data-exploring tool GeneProfiler. The G-CDM may facilitate analyses of interoperating clinical and genomic datasets across multiple institutions, minimizing privacy issues and enabling researchers to better understand the characteristics of patients and promote personalized medicine in clinical practice.
format Online
Article
Text
id pubmed-6454347
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6454347 2019-04-26 Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study Shin, Seo Jeong You, Seng Chan Park, Yu Rang Roh, Jin Kim, Jang-Hee Haam, Seokjin Reich, Christian G Blacketer, Clair Son, Dae-Soon Oh, Seungbin Park, Rae Woong J Med Internet Res Original Paper JMIR Publications 2019-03-26 /pmc/articles/PMC6454347/ /pubmed/30912749 http://dx.doi.org/10.2196/13249 Text en ©Seo Jeong Shin, Seng Chan You, Yu Rang Park, Jin Roh, Jang-Hee Kim, Seokjin Haam, Christian G Reich, Clair Blacketer, Dae-Soon Son, Seungbin Oh, Rae Woong Park. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 26.03.2019.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Shin, Seo Jeong
You, Seng Chan
Park, Yu Rang
Roh, Jin
Kim, Jang-Hee
Haam, Seokjin
Reich, Christian G
Blacketer, Clair
Son, Dae-Soon
Oh, Seungbin
Park, Rae Woong
Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title_full Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title_fullStr Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title_full_unstemmed Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title_short Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study
title_sort genomic common data model for seamless interoperation of biomedical data in clinical practice: retrospective study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454347/
https://www.ncbi.nlm.nih.gov/pubmed/30912749
http://dx.doi.org/10.2196/13249