
PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records

OBJECTIVE: Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs. MAT...


Bibliographic Details
Main authors: Zheng, Neil S, Feng, QiPing, Kerchberger, V Eric, Zhao, Juan, Edwards, Todd L, Cox, Nancy J, Stein, C Michael, Roden, Dan M, Denny, Joshua C, Wei, Wei-Qi
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7751140/
https://www.ncbi.nlm.nih.gov/pubmed/32974638
http://dx.doi.org/10.1093/jamia/ocaa104
author Zheng, Neil S
Feng, QiPing
Kerchberger, V Eric
Zhao, Juan
Edwards, Todd L
Cox, Nancy J
Stein, C Michael
Roden, Dan M
Denny, Joshua C
Wei, Wei-Qi
collection PubMed
description OBJECTIVE: Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs. MATERIALS AND METHODS: PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes that have been extracted by natural language processing from publicly available resources. PheMap searches EHRs for each phenotype’s quantified concepts and uses them to calculate an individual’s probability of having this phenotype. We compared PheMap to clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network for type 2 diabetes mellitus (T2DM), dementia, and hypothyroidism using 84 821 individuals from Vanderbilt University Medical Center's BioVU DNA Biobank. We implemented PheMap-based phenotypes for genome-wide association studies (GWAS) for T2DM, dementia, and hypothyroidism, and phenome-wide association studies (PheWAS) for variants in FTO, HLA-DRB1, and TCF7L2. RESULTS: In this initial iteration, the PheMap knowledge base contains quantified concepts for 841 disease phenotypes. For T2DM, dementia, and hypothyroidism, the accuracy of the PheMap phenotypes was >97% using a 50% threshold and eMERGE case-control status as a reference standard. In the GWAS analyses, PheMap-derived phenotype probabilities replicated 43 of 51 previously reported disease-associated variants for the 3 phenotypes. For 9 of the 11 top associations, PheMap provided an equivalent or more significant P value than eMERGE-based phenotypes. The PheMap-based PheWAS showed comparable or better performance to a traditional phecode-based PheWAS. PheMap is publicly available online.
CONCLUSIONS: PheMap significantly streamlines the process of extracting research-quality phenotype information from EHRs, with comparable or better performance to current phenotyping approaches.
format Online
Article
Text
id pubmed-7751140
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7751140 2020-12-28 PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records Zheng, Neil S Feng, QiPing Kerchberger, V Eric Zhao, Juan Edwards, Todd L Cox, Nancy J Stein, C Michael Roden, Dan M Denny, Joshua C Wei, Wei-Qi J Am Med Inform Assoc Research and Applications OBJECTIVE: Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs. MATERIALS AND METHODS: PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes that have been extracted by natural language processing from publicly available resources. PheMap searches EHRs for each phenotype’s quantified concepts and uses them to calculate an individual’s probability of having this phenotype. We compared PheMap to clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network for type 2 diabetes mellitus (T2DM), dementia, and hypothyroidism using 84 821 individuals from Vanderbilt University Medical Center's BioVU DNA Biobank. We implemented PheMap-based phenotypes for genome-wide association studies (GWAS) for T2DM, dementia, and hypothyroidism, and phenome-wide association studies (PheWAS) for variants in FTO, HLA-DRB1, and TCF7L2. RESULTS: In this initial iteration, the PheMap knowledge base contains quantified concepts for 841 disease phenotypes. For T2DM, dementia, and hypothyroidism, the accuracy of the PheMap phenotypes was >97% using a 50% threshold and eMERGE case-control status as a reference standard. In the GWAS analyses, PheMap-derived phenotype probabilities replicated 43 of 51 previously reported disease-associated variants for the 3 phenotypes.
For 9 of the 11 top associations, PheMap provided an equivalent or more significant P value than eMERGE-based phenotypes. The PheMap-based PheWAS showed comparable or better performance to a traditional phecode-based PheWAS. PheMap is publicly available online. CONCLUSIONS: PheMap significantly streamlines the process of extracting research-quality phenotype information from EHRs, with comparable or better performance to current phenotyping approaches. Oxford University Press 2020-09-24 /pmc/articles/PMC7751140/ /pubmed/32974638 http://dx.doi.org/10.1093/jamia/ocaa104 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7751140/
https://www.ncbi.nlm.nih.gov/pubmed/32974638
http://dx.doi.org/10.1093/jamia/ocaa104