CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph
Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which are increasingly used in electronic health records (EHRs), too. Many algori...
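Main authors: | Peng, Chengyao; Dieck, Simon; Schmid, Alexander; Ahmad, Ashar; Knaus, Alexej; Wenzel, Maren; Mehnert, Laura; Zirn, Birgit; Haack, Tobias; Ossowski, Stephan; Wagner, Matias; Brunet, Theresa; Ehmke, Nadja; Danyel, Magdalena; Rosnev, Stanislav; Kamphans, Tom; Nadav, Guy; Fleischer, Nicole; Fröhlich, Holger; Krawitz, Peter |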
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Methods Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415429/ https://www.ncbi.nlm.nih.gov/pubmed/34514393 http://dx.doi.org/10.1093/nargab/lqab078 |
_version_ | 1783747968640221184 |
---|---|
author | Peng, Chengyao; Dieck, Simon; Schmid, Alexander; Ahmad, Ashar; Knaus, Alexej; Wenzel, Maren; Mehnert, Laura; Zirn, Birgit; Haack, Tobias; Ossowski, Stephan; Wagner, Matias; Brunet, Theresa; Ehmke, Nadja; Danyel, Magdalena; Rosnev, Stanislav; Kamphans, Tom; Nadav, Guy; Fleischer, Nicole; Fröhlich, Holger; Krawitz, Peter |
author_sort | Peng, Chengyao |
collection | PubMed |
description | Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which are also increasingly used in electronic health records (EHRs). Many algorithms that perform HPO-based gene prioritization have been developed; however, the performance of many such tools suffers from an over-representation of atypical cases in the medical literature. This is certainly the case if the algorithm cannot handle features that occur with reduced frequency in a disorder. With CADA, we built a knowledge graph based on both case annotations and disorder annotations. Using network representation learning, we achieve gene prioritization by link prediction. Our results suggest that CADA exhibits superior performance, particularly for patients who present with the pathognomonic findings of a disease. Additionally, information about the frequency of occurrence of a feature can readily be incorporated when available. Crucial in the design of our approach is the use of the growing amount of phenotype–genotype information that diagnostic labs deposit in databases such as ClinVar. By this means, CADA is an ideal reference tool for differential diagnostics in rare disorders that can also be updated regularly. |
format | Online Article Text |
id | pubmed-8415429 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8415429 2021-09-09 CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph NAR Genom Bioinform Methods Article Oxford University Press 2021-09-03 /pmc/articles/PMC8415429/ /pubmed/34514393 http://dx.doi.org/10.1093/nargab/lqab078 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415429/ https://www.ncbi.nlm.nih.gov/pubmed/34514393 http://dx.doi.org/10.1093/nargab/lqab078 |
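
The description in this record outlines gene prioritization by link prediction on a graph built from both disorder annotations and case annotations. The Python sketch below illustrates only that general idea and is not the authors' implementation: it does not perform network representation learning, and a simple degree-weighted edge score stands in for learned embeddings. All phenotype terms (`HP:1` to `HP:4`), gene names (`GENE_A` to `GENE_C`), and function names are made-up placeholders.

```python
import math
from collections import defaultdict

# Toy, invented annotations: (phenotype term, gene) pairs. Not real HPO data.
DISORDER_EDGES = [("HP:1", "GENE_A"), ("HP:2", "GENE_A"),
                  ("HP:2", "GENE_B"), ("HP:3", "GENE_B")]
CASE_EDGES = [("HP:1", "GENE_A"), ("HP:4", "GENE_A"), ("HP:3", "GENE_C")]


def build_graph(*edge_lists):
    """Merge edge lists into one weighted phenotype-gene graph.

    The weight of an edge is the number of times the pair was annotated,
    so a pair supported by both sources counts more than a single mention.
    """
    weights = defaultdict(float)
    for edges in edge_lists:
        for phenotype, gene in edges:
            weights[(phenotype, gene)] += 1.0
    return dict(weights)


def prioritize(graph, patient_phenotypes):
    """Rank genes for a patient by a simple link score (an assumption).

    Each matching edge contributes its weight divided by log(1 + degree)
    of the phenotype, so terms linked to few genes are more informative.
    """
    degree = defaultdict(int)
    for phenotype, _gene in graph:
        degree[phenotype] += 1

    scores = defaultdict(float)
    for (phenotype, gene), weight in graph.items():
        if phenotype in patient_phenotypes:
            scores[gene] += weight / math.log(1 + degree[phenotype])

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    graph = build_graph(DISORDER_EDGES, CASE_EDGES)
    for gene, score in prioritize(graph, {"HP:1", "HP:3"}):
        print(f"{gene}\t{score:.3f}")
```

In this toy data the term `HP:4` is linked to a gene only through a case annotation, so a graph built from disorder annotations alone returns no candidates for it. That mirrors the record's point that case-level data can add links that disorder-level annotations lack.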