Next Generation Phenotyping Using the Unified Medical Language System
BACKGROUND: Structured information within patient medical records represents a largely untapped treasure trove of research data. In the United States, privacy issues notwithstanding, this has recently become more accessible thanks to the increasing adoption of electronic health records (EHR) and hea...
Main Authors: | Adamusiak, Tomasz; Shimoyama, Naoki; Shimoyama, Mary |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Gunther Eysenbach 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288084/ https://www.ncbi.nlm.nih.gov/pubmed/25601137 http://dx.doi.org/10.2196/medinform.3172 |
_version_ | 1782351906089205760 |
---|---|
author | Adamusiak, Tomasz Shimoyama, Naoki Shimoyama, Mary |
author_sort | Adamusiak, Tomasz |
collection | PubMed |
description | BACKGROUND: Structured information within patient medical records represents a largely untapped treasure trove of research data. In the United States, privacy issues notwithstanding, this information has recently become more accessible thanks to the increasing adoption of electronic health records (EHR) and health care data standards fueled by the Meaningful Use legislation. The other side of the coin is that it is becoming increasingly difficult to navigate the profusion of disparate clinical terminology standards, which often span millions of concepts. OBJECTIVE: The objective of our study was to develop a methodology for integrating large amounts of structured clinical information that is both terminology agnostic and able to capture heterogeneous clinical phenotypes, including problems, procedures, medications, and clinical results (such as laboratory tests and clinical observations). In this context, we define phenotyping as the extraction of all clinically relevant features contained in the EHR. METHODS: The scope of the project was framed by the Common Meaningful Use (MU) Dataset terminology standards: the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), RxNorm, the Logical Observation Identifiers Names and Codes (LOINC), the Current Procedural Terminology (CPT), the Healthcare Common Procedure Coding System (HCPCS), the International Classification of Diseases Ninth Revision Clinical Modification (ICD-9-CM), and the International Classification of Diseases Tenth Revision Clinical Modification (ICD-10-CM). The Unified Medical Language System (UMLS) was used as a mapping layer among the MU ontologies. An extract, load, and transform approach separated original annotations in the EHR from the mapping process and allowed for continuous updates as the terminologies were updated. Additionally, we integrated all terminologies into a single UMLS-derived ontology and further optimized it to make the relatively large concept graph manageable. RESULTS: The initial evaluation was performed with simulated data from the Clinical Avatars project using 100,000 virtual patients undergoing a 90-day, genotype-guided warfarin dosing protocol. This dataset was annotated with standard MU terminologies, loaded, and transformed using the UMLS. We have deployed this methodology at scale in our in-house analytics platform using structured EHR data for 7931 patients (12 million clinical observations) treated at the Froedtert Hospital. A demonstration limited to Clinical Avatars data is available on the Internet using the credentials user “jmirdemo” and password “jmirdemo”. CONCLUSIONS: Despite its inherent complexity, the UMLS can serve as an effective interface terminology for many of the clinical data standards currently used in the health care domain. |
format | Online Article Text |
id | pubmed-4288084 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Gunther Eysenbach |
record_format | MEDLINE/PubMed |
spelling | pubmed-4288084 2015-01-15 Next Generation Phenotyping Using the Unified Medical Language System Adamusiak, Tomasz Shimoyama, Naoki Shimoyama, Mary JMIR Med Inform Original Paper Gunther Eysenbach 2014-03-18 /pmc/articles/PMC4288084/ /pubmed/25601137 http://dx.doi.org/10.2196/medinform.3172 Text en ©Tomasz Adamusiak, Naoki Shimoyama, Mary Shimoyama. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 18.03.2014. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Adamusiak, Tomasz Shimoyama, Naoki Shimoyama, Mary Next Generation Phenotyping Using the Unified Medical Language System |
title | Next Generation Phenotyping Using the Unified Medical Language System |
title_sort | next generation phenotyping using the unified medical language system |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288084/ https://www.ncbi.nlm.nih.gov/pubmed/25601137 http://dx.doi.org/10.2196/medinform.3172 |
work_keys_str_mv | AT adamusiaktomasz nextgenerationphenotypingusingtheunifiedmedicallanguagesystem AT shimoyamanaoki nextgenerationphenotypingusingtheunifiedmedicallanguagesystem AT shimoyamamary nextgenerationphenotypingusingtheunifiedmedicallanguagesystem |
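
The abstract describes an extract, load, and transform approach in which annotations are stored as recorded in the EHR and mapped to UMLS concepts in a separate step. The following is a minimal Python sketch of that separation, not the authors' implementation. Every code and concept identifier in it is a made-up placeholder, the `Annotation`, `load`, and `transform` names are hypothetical, and the vocabulary labels are only assumed to follow UMLS source abbreviations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    patient_id: str
    vocabulary: str  # source terminology label (assumed UMLS-style abbreviation)
    code: str        # code exactly as recorded in the EHR (placeholder values here)


def load(raw_rows):
    """Load step: keep annotations as recorded, with no mapping applied."""
    return [Annotation(*row) for row in raw_rows]


def transform(annotations, concept_map):
    """Transform step: resolve each (vocabulary, code) pair to a shared concept id.

    Annotations without a mapping are returned separately so they can be
    retried after the mapping table is refreshed.
    """
    mapped, unmapped = [], []
    for ann in annotations:
        concept_id = concept_map.get((ann.vocabulary, ann.code))
        if concept_id is None:
            unmapped.append(ann)
        else:
            mapped.append((ann.patient_id, concept_id))
    return mapped, unmapped


# Hypothetical mapping table: two codes from different terminologies
# resolve to the same placeholder concept identifier.
CONCEPT_MAP = {
    ("ICD9CM", "dx-001"): "CUI-EXAMPLE-1",
    ("SNOMEDCT_US", "sct-001"): "CUI-EXAMPLE-1",
    ("RXNORM", "rx-001"): "CUI-EXAMPLE-2",
}

# Hypothetical raw annotations as they might arrive from an EHR export.
RAW_ROWS = [
    ("p1", "ICD9CM", "dx-001"),
    ("p2", "SNOMEDCT_US", "sct-001"),
    ("p2", "RXNORM", "rx-001"),
    ("p3", "LNC", "lab-001"),
]

stored = load(RAW_ROWS)
mapped, unmapped = transform(stored, CONCEPT_MAP)

# A terminology update only changes the mapping table; the stored
# annotations are reused untouched and the transform is simply rerun.
updated_map = {**CONCEPT_MAP, ("LNC", "lab-001"): "CUI-EXAMPLE-3"}
remapped, still_unmapped = transform(stored, updated_map)
```

Because the stored annotations are never rewritten, rerunning `transform` against a newer mapping table is enough to pick up a terminology release, which is the property the abstract attributes to separating the original annotations from the mapping process.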