Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes
Enabling discovery across the spectrum of rare and common diseases requires the integration of biological knowledge with clinical data; however, differences in terminologies present a major barrier. For example, the Human Phenotype Ontology (HPO) is the primary vocabulary for describing features of...
Main authors: | McArthur, Evonne; Bastarache, Lisa; Capra, John A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Brief Communications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9976874/ https://www.ncbi.nlm.nih.gov/pubmed/36875690 http://dx.doi.org/10.1093/jamiaopen/ooad007 |
_version_ | 1784899170040020992 |
---|---|
author | McArthur, Evonne Bastarache, Lisa Capra, John A |
author_facet | McArthur, Evonne Bastarache, Lisa Capra, John A |
author_sort | McArthur, Evonne |
collection | PubMed |
description | Enabling discovery across the spectrum of rare and common diseases requires the integration of biological knowledge with clinical data; however, differences in terminologies present a major barrier. For example, the Human Phenotype Ontology (HPO) is the primary vocabulary for describing features of rare diseases, while most clinical encounters use International Classification of Diseases (ICD) billing codes. ICD codes are further organized into clinically meaningful phenotypes via phecodes. Despite their prevalence, no robust phenome-wide disease mapping between HPO and phecodes/ICD exists. Here, we synthesize evidence using diverse sources and methods—including text matching, the National Library of Medicine’s Unified Medical Language System (UMLS), Wikipedia, SORTA, and PheMap—to define a mapping between phecodes and HPO terms via 38 950 links. We evaluate the precision and recall for each domain of evidence, both individually and jointly. This flexibility permits users to tailor the HPO–phecode links for diverse applications along the spectrum of monogenic to polygenic diseases. |
format | Online Article Text |
id | pubmed-9976874 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9976874 2023-03-02 Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes McArthur, Evonne Bastarache, Lisa Capra, John A JAMIA Open Brief Communications Enabling discovery across the spectrum of rare and common diseases requires the integration of biological knowledge with clinical data; however, differences in terminologies present a major barrier. For example, the Human Phenotype Ontology (HPO) is the primary vocabulary for describing features of rare diseases, while most clinical encounters use International Classification of Diseases (ICD) billing codes. ICD codes are further organized into clinically meaningful phenotypes via phecodes. Despite their prevalence, no robust phenome-wide disease mapping between HPO and phecodes/ICD exists. Here, we synthesize evidence using diverse sources and methods—including text matching, the National Library of Medicine’s Unified Medical Language System (UMLS), Wikipedia, SORTA, and PheMap—to define a mapping between phecodes and HPO terms via 38 950 links. We evaluate the precision and recall for each domain of evidence, both individually and jointly. This flexibility permits users to tailor the HPO–phecode links for diverse applications along the spectrum of monogenic to polygenic diseases. Oxford University Press 2023-02-28 /pmc/articles/PMC9976874/ /pubmed/36875690 http://dx.doi.org/10.1093/jamiaopen/ooad007 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Brief Communications McArthur, Evonne Bastarache, Lisa Capra, John A Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title | Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title_full | Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title_fullStr | Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title_full_unstemmed | Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title_short | Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
title_sort | linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes |
topic | Brief Communications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9976874/ https://www.ncbi.nlm.nih.gov/pubmed/36875690 http://dx.doi.org/10.1093/jamiaopen/ooad007 |
work_keys_str_mv | AT mcarthurevonne linkingrareandcommondiseasevocabulariesbymappingbetweenthehumanphenotypeontologyandphecodes AT bastarachelisa linkingrareandcommondiseasevocabulariesbymappingbetweenthehumanphenotypeontologyandphecodes AT caprajohna linkingrareandcommondiseasevocabulariesbymappingbetweenthehumanphenotypeontologyandphecodes |