
CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph


Bibliographic Details
Main authors: Peng, Chengyao, Dieck, Simon, Schmid, Alexander, Ahmad, Ashar, Knaus, Alexej, Wenzel, Maren, Mehnert, Laura, Zirn, Birgit, Haack, Tobias, Ossowski, Stephan, Wagner, Matias, Brunet, Theresa, Ehmke, Nadja, Danyel, Magdalena, Rosnev, Stanislav, Kamphans, Tom, Nadav, Guy, Fleischer, Nicole, Fröhlich, Holger, Krawitz, Peter
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415429/
https://www.ncbi.nlm.nih.gov/pubmed/34514393
http://dx.doi.org/10.1093/nargab/lqab078
author Peng, Chengyao
Dieck, Simon
Schmid, Alexander
Ahmad, Ashar
Knaus, Alexej
Wenzel, Maren
Mehnert, Laura
Zirn, Birgit
Haack, Tobias
Ossowski, Stephan
Wagner, Matias
Brunet, Theresa
Ehmke, Nadja
Danyel, Magdalena
Rosnev, Stanislav
Kamphans, Tom
Nadav, Guy
Fleischer, Nicole
Fröhlich, Holger
Krawitz, Peter
author_sort Peng, Chengyao
collection PubMed
description Many rare syndromes can be described well, and delineated from other disorders, by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which are also increasingly used in electronic health records (EHRs). Many algorithms that perform HPO-based gene prioritization have been developed; however, the performance of many such tools suffers from an over-representation of atypical cases in the medical literature. This is certainly the case when an algorithm cannot handle features that occur with reduced frequency in a disorder. With CADA, we built a knowledge graph based on both case annotations and disorder annotations. Using network representation learning, we achieve gene prioritization by link prediction. Our results suggest that CADA exhibits superior performance, particularly for patients who present with the pathognomonic findings of a disease. Additionally, information about the frequency of occurrence of a feature can readily be incorporated when available. Crucial to the design of our approach is the use of the growing amount of phenotype–genotype information that diagnostic labs deposit in databases such as ClinVar. In this way, CADA is an ideal reference tool for differential diagnostics in rare disorders that can also be updated regularly.
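The idea of ranking genes by link prediction on a graph built from both disorder and case annotations can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes network representation learning, whereas the sketch below swaps in a much simpler neighborhood heuristic (an Adamic-Adar-style weighted score). The function names, the placeholder identifiers (`GENE_A`, `HP:T1`, and so on), and the toy annotations are all hypothetical and carry no biological meaning.

```python
import math
from collections import defaultdict

# Toy inputs. Every identifier and value here is an illustrative placeholder.
# Disorder-level annotations: (gene, phenotype term, frequency of the feature in 0..1).
DISORDER_ANNOTATIONS = [
    ("GENE_A", "HP:T1", 0.9),
    ("GENE_A", "HP:T2", 0.2),
    ("GENE_B", "HP:T2", 0.8),
    ("GENE_B", "HP:T3", 0.5),
]
# Case-level annotations: (diagnosed gene, phenotype terms observed in that case).
CASE_ANNOTATIONS = [
    ("GENE_A", ["HP:T1", "HP:T3"]),
    ("GENE_B", ["HP:T2"]),
]


def build_graph(disorder_annotations, case_annotations):
    """Return weighted gene-phenotype edges as weights[term][gene] -> float."""
    weights = defaultdict(lambda: defaultdict(float))
    # Disorder edges are weighted by the reported feature frequency.
    for gene, term, frequency in disorder_annotations:
        weights[term][gene] += frequency
    # Each diagnosed case adds one unit of evidence per observed term.
    for gene, terms in case_annotations:
        for term in terms:
            weights[term][gene] += 1.0
    return weights


def prioritize(weights, patient_terms):
    """Rank genes by a heuristic link score to the patient's phenotype set."""
    scores = defaultdict(float)
    for term in set(patient_terms):
        genes = weights.get(term, {})
        if not genes:
            continue  # term unknown to the graph: contributes nothing
        # Terms linked to many genes are less specific, so discount them.
        discount = 1.0 / math.log(1 + len(genes))
        for gene, weight in genes.items():
            scores[gene] += weight * discount
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


graph = build_graph(DISORDER_ANNOTATIONS, CASE_ANNOTATIONS)
ranking = prioritize(graph, ["HP:T1", "HP:T3"])
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this sketch the case annotations add an edge between `GENE_A` and `HP:T3` that the disorder annotations alone lack, which is the kind of enrichment the description attributes to case data. A learned embedding model would replace `prioritize` in a real system.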
format Online
Article
Text
id pubmed-8415429
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8415429 2021-09-09 NAR Genom Bioinform Methods Article Oxford University Press 2021-09-03 /pmc/articles/PMC8415429/ /pubmed/34514393 http://dx.doi.org/10.1093/nargab/lqab078 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph
topic Methods Article