
Next Generation Phenotyping Using the Unified Medical Language System

Bibliographic Details
Main Authors: Adamusiak, Tomasz; Shimoyama, Naoki; Shimoyama, Mary
Format: Online Article Text
Language: English
Published: Gunther Eysenbach 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288084/
https://www.ncbi.nlm.nih.gov/pubmed/25601137
http://dx.doi.org/10.2196/medinform.3172
author Adamusiak, Tomasz
Shimoyama, Naoki
Shimoyama, Mary
collection PubMed
description BACKGROUND: Structured information within patient medical records represents a largely untapped treasure trove of research data. In the United States, privacy issues notwithstanding, this information has recently become more accessible thanks to the increasing adoption of electronic health records (EHR) and health care data standards fueled by the Meaningful Use legislation. The other side of the coin is that it is becoming increasingly difficult to navigate the profusion of disparate clinical terminology standards, which often span millions of concepts.
OBJECTIVE: The objective of our study was to develop a methodology for integrating large amounts of structured clinical information that is both terminology-agnostic and able to capture heterogeneous clinical phenotypes, including problems, procedures, medications, and clinical results (such as laboratory tests and clinical observations). In this context, we define phenotyping as the extraction of all clinically relevant features contained in the EHR.
METHODS: The scope of the project was framed by the Common Meaningful Use (MU) Dataset terminology standards: the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), RxNorm, the Logical Observation Identifiers Names and Codes (LOINC), the Current Procedural Terminology (CPT), the Healthcare Common Procedure Coding System (HCPCS), the International Classification of Diseases Ninth Revision Clinical Modification (ICD-9-CM), and the International Classification of Diseases Tenth Revision Clinical Modification (ICD-10-CM). The Unified Medical Language System (UMLS) was used as a mapping layer among the MU ontologies. An extract, load, and transform approach separated original annotations in the EHR from the mapping process and allowed for continuous updates as the terminologies were updated. Additionally, we integrated all terminologies into a single UMLS-derived ontology and further optimized it to make the relatively large concept graph manageable.
RESULTS: The initial evaluation was performed with simulated data from the Clinical Avatars project using 100,000 virtual patients undergoing a 90-day, genotype-guided warfarin dosing protocol. This dataset was annotated with standard MU terminologies, loaded, and transformed using the UMLS. We have deployed this methodology at scale in our in-house analytics platform using structured EHR data for 7931 patients (12 million clinical observations) treated at the Froedtert Hospital. A demonstration limited to Clinical Avatars data is available on the Internet using the credentials user “jmirdemo” and password “jmirdemo”.
CONCLUSIONS: Despite its inherent complexity, the UMLS can serve as an effective interface terminology for many of the clinical data standards currently used in the health care domain.
format Online
Article
Text
id pubmed-4288084
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Gunther Eysenbach
record_format MEDLINE/PubMed
spelling pubmed-4288084 2015-01-15 Next Generation Phenotyping Using the Unified Medical Language System Adamusiak, Tomasz; Shimoyama, Naoki; Shimoyama, Mary JMIR Med Inform Original Paper Gunther Eysenbach 2014-03-18 /pmc/articles/PMC4288084/ /pubmed/25601137 http://dx.doi.org/10.2196/medinform.3172 Text en ©Tomasz Adamusiak, Naoki Shimoyama, Mary Shimoyama. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 18.03.2014. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Next Generation Phenotyping Using the Unified Medical Language System
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288084/
https://www.ncbi.nlm.nih.gov/pubmed/25601137
http://dx.doi.org/10.2196/medinform.3172