
Graph-based signal integration for high-throughput phenotyping

BACKGROUND: Electronic Health Records aggregated in Clinical Data Warehouses (CDWs) promise to revolutionize Comparative Effectiveness Research and suggest new avenues of research. However, the effectiveness of CDWs is diminished by the lack of properly labeled data. We present a novel approach that...

Bibliographic Details
Main authors: Herskovic, Jorge R; Subramanian, Devika; Cohen, Trevor; Bozzo-Silva, Pamela A; Bearden, Charles F; Bernstam, Elmer V
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426800/
https://www.ncbi.nlm.nih.gov/pubmed/23320851
http://dx.doi.org/10.1186/1471-2105-13-S13-S2
author Herskovic, Jorge R
Subramanian, Devika
Cohen, Trevor
Bozzo-Silva, Pamela A
Bearden, Charles F
Bernstam, Elmer V
collection PubMed
description BACKGROUND: Electronic Health Records aggregated in Clinical Data Warehouses (CDWs) promise to revolutionize Comparative Effectiveness Research and suggest new avenues of research. However, the effectiveness of CDWs is diminished by the lack of properly labeled data. We present a novel approach that integrates knowledge from the CDW, the biomedical literature, and the Unified Medical Language System (UMLS) to perform high-throughput phenotyping. In this paper, we automatically construct a graphical knowledge model and then use it to phenotype breast cancer patients. We compare the performance of this approach to using MetaMap when labeling records. RESULTS: MetaMap's overall accuracy at identifying breast cancer patients was 51.1% (n=428); recall=85.4%, precision=26.2%, and F1=40.1%. Our unsupervised graph-based high-throughput phenotyping had an accuracy of 84.1%; recall=46.3%, precision=61.2%, and F1=52.8%. CONCLUSIONS: We conclude that our approach is a promising alternative for unsupervised high-throughput phenotyping.
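The F1 values quoted in the abstract can be checked against the quoted precision and recall. The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the `reported` dictionary are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, converted to fractions.
# The dictionary keys are labels chosen for this sketch.
reported = {
    "MetaMap": (0.262, 0.854),
    "graph-based": (0.612, 0.463),
}

for method, (precision, recall) in reported.items():
    print(f"{method}: F1 = {f1_score(precision, recall) * 100:.1f}%")
# prints "MetaMap: F1 = 40.1%" and "graph-based: F1 = 52.7%"
```

The MetaMap figure matches the abstract. The graph-based figure comes out 0.1 percentage points below the reported 52.8%, which would be consistent with the published precision and recall being rounded to one decimal place; that explanation is an assumption, not something the record states.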
format Online
Article
Text
id pubmed-3426800
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3426800 2012-08-24 BMC Bioinformatics Research BioMed Central 2012-08-24 Text en Copyright ©2012 Herskovic et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Graph-based signal integration for high-throughput phenotyping
topic Research