Data model harmonization for the All Of Us Research Program: Transforming i2b2 data into the OMOP common data model
BACKGROUND: The All Of Us Research Program (AOU) is building a nationwide cohort of one million patients’ EHR and genomic data. Data interoperability is paramount to the program’s success. AOU is standardizing its EHR data around the Observational Medical Outcomes Partnership (OMOP) data model. OMOP...
Main authors: | Klann, Jeffrey G.; Joss, Matthew A. H.; Embree, Kevin; Murphy, Shawn N. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2019 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380544/ https://www.ncbi.nlm.nih.gov/pubmed/30779778 http://dx.doi.org/10.1371/journal.pone.0212463 |
_version_ | 1783396313562349568 |
---|---|
author | Klann, Jeffrey G. Joss, Matthew A. H. Embree, Kevin Murphy, Shawn N. |
collection | PubMed |
description | BACKGROUND: The All Of Us Research Program (AOU) is building a nationwide cohort of one million patients’ EHR and genomic data. Data interoperability is paramount to the program’s success. AOU is standardizing its EHR data around the Observational Medical Outcomes Partnership (OMOP) data model. OMOP is one of several standard data models presently used in national-scale initiatives. Each model is unique enough to make interoperability difficult. The i2b2 data warehousing and analytics platform, used at over 200 sites worldwide, takes a flexible ontology-driven approach to data storage. We previously demonstrated that this ontology system can drive data reconfiguration, transforming data into new formats without site-specific programming. We previously implemented this on our 12-site Accessible Research Commons for Health (ARCH) network to transform i2b2 data into the Patient Centered Outcomes Research Network (PCORnet) model. METHODS AND RESULTS: Here, we leverage our investment in i2b2 high-performance transformations to support the AOU OMOP data pipeline. Because the ARCH ontology has gained widespread national interest (through the Accrual to Clinical Trials network, other PCORnet networks, and the Nebraska Lexicon), we leveraged sites’ existing investments in this standard ontology. We developed an i2b2-to-OMOP transformation, driven by the ARCH-OMOP ontology and the OMOP concept mapping dictionary. We demonstrated and validated our approach in the AOU New England HPO (NEHPO). First, we transformed a fake patient dataset in i2b2 into OMOP and verified through AOU tools that the data was structurally compliant with OMOP. We then transformed a subset of data in the Partners Healthcare data warehouse into OMOP. We developed a checklist of assessments to ensure the transformed data had self-integrity (e.g., the distributions have an expected shape and required fields are populated), using OMOP’s visual Achilles data quality tool. This i2b2-to-OMOP transformation is being used to send NEHPO production data to AOU. It is open-source and ready for use by other research projects. |
format | Online Article Text |
id | pubmed-6380544 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6380544 2019-03-01 Data model harmonization for the All Of Us Research Program: Transforming i2b2 data into the OMOP common data model Klann, Jeffrey G. Joss, Matthew A. H. Embree, Kevin Murphy, Shawn N. PLoS One Research Article Public Library of Science 2019-02-19 /pmc/articles/PMC6380544/ /pubmed/30779778 http://dx.doi.org/10.1371/journal.pone.0212463 Text en © 2019 Klann et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Data model harmonization for the All Of Us Research Program: Transforming i2b2 data into the OMOP common data model |
topic | Research Article |
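
The abstract in this record mentions self-integrity assessments such as "required fields are populated" and "distributions have an expected shape". As a rough illustration of what checks of that kind can look like, here is a minimal Python sketch. It is not the authors' checklist and it does not use Achilles; the helper names (`missing_required`, `birth_year_histogram`) and the sample rows are hypothetical, and the three column names are assumed to be required columns of the OMOP `person` table.

```python
# Illustrative sketch only: helper names and sample rows are hypothetical.
# Column names are assumed to match required columns of the OMOP person table.
REQUIRED_PERSON_FIELDS = ("person_id", "gender_concept_id", "year_of_birth")


def missing_required(rows, required=REQUIRED_PERSON_FIELDS):
    """Return {field: number of rows where the field is absent or None}."""
    counts = {field: 0 for field in required}
    for row in rows:
        for field in required:
            if row.get(field) is None:
                counts[field] += 1
    return counts


def birth_year_histogram(rows):
    """Count rows per year_of_birth, skipping rows that have no year."""
    hist = {}
    for row in rows:
        year = row.get("year_of_birth")
        if year is not None:
            hist[year] = hist.get(year, 0) + 1
    return hist


# Made-up rows standing in for transformed person records.
sample = [
    {"person_id": 1, "gender_concept_id": 8507, "year_of_birth": 1970},
    {"person_id": 2, "gender_concept_id": None, "year_of_birth": 1985},
    {"person_id": 3, "gender_concept_id": 8532, "year_of_birth": 1970},
]

print(missing_required(sample))
print(birth_year_histogram(sample))
```

A nonzero count from `missing_required` would flag rows to investigate, and the histogram gives a quick view of whether a distribution looks plausible before the data is sent onward.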