Towards a more molecular taxonomy of disease
BACKGROUND: Disease taxonomies have been designed for many applications, but they tend not to fully incorporate the growing amount of molecular-level knowledge of disease processes, inhibiting research efforts. Understanding the degree to which we can infer disease relationships from molecular data...
Main authors: | Park, Jisoo; Hescott, Benjamin J.; Slonim, Donna K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2017 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530939/ https://www.ncbi.nlm.nih.gov/pubmed/28750648 http://dx.doi.org/10.1186/s13326-017-0134-0 |
_version_ | 1783253321408053248 |
---|---|
author | Park, Jisoo Hescott, Benjamin J. Slonim, Donna K. |
author_facet | Park, Jisoo Hescott, Benjamin J. Slonim, Donna K. |
author_sort | Park, Jisoo |
collection | PubMed |
description | BACKGROUND: Disease taxonomies have been designed for many applications, but they tend not to fully incorporate the growing amount of molecular-level knowledge of disease processes, inhibiting research efforts. Understanding the degree to which we can infer disease relationships from molecular data alone may yield insights into how to ultimately construct more modern taxonomies that integrate both physiological and molecular information. RESULTS: We introduce a new technique we call Parent Promotion to infer hierarchical relationships between disease terms using disease-gene data. We compare this technique with both an established ontology inference method (CliXO) and a minimum weight spanning tree approach. Because there is no gold standard molecular disease taxonomy available, we compare our inferred hierarchies to both the Medical Subject Headings (MeSH) category C forest of diseases and to subnetworks of the Disease Ontology (DO). This comparison provides insights about the inference algorithms, choices of evaluation metrics, and the existing molecular content of various subnetworks of MeSH and the DO. Our results suggest that the Parent Promotion method performs well in most cases. Performance across MeSH trees is also correlated between inference methods. Specifically, inferred relationships are more consistent with those in smaller MeSH disease trees than larger ones, but there are some notable exceptions that may correlate with higher molecular content in MeSH. CONCLUSIONS: Our experiments provide insights about learning relationships between diseases from disease genes alone. Future work should explore the prospect of disease term discovery from molecular data and how best to integrate molecular data with anatomical and clinical knowledge. This study nonetheless suggests that disease gene information has the potential to form an important part of the foundation for future representations of the disease landscape. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-017-0134-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5530939 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-55309392017-08-02 Towards a more molecular taxonomy of disease Park, Jisoo Hescott, Benjamin J. Slonim, Donna K. J Biomed Semantics Research BACKGROUND: Disease taxonomies have been designed for many applications, but they tend not to fully incorporate the growing amount of molecular-level knowledge of disease processes, inhibiting research efforts. Understanding the degree to which we can infer disease relationships from molecular data alone may yield insights into how to ultimately construct more modern taxonomies that integrate both physiological and molecular information. RESULTS: We introduce a new technique we call Parent Promotion to infer hierarchical relationships between disease terms using disease-gene data. We compare this technique with both an established ontology inference method (CliXO) and a minimum weight spanning tree approach. Because there is no gold standard molecular disease taxonomy available, we compare our inferred hierarchies to both the Medical Subject Headings (MeSH) category C forest of diseases and to subnetworks of the Disease Ontology (DO). This comparison provides insights about the inference algorithms, choices of evaluation metrics, and the existing molecular content of various subnetworks of MeSH and the DO. Our results suggest that the Parent Promotion method performs well in most cases. Performance across MeSH trees is also correlated between inference methods. Specifically, inferred relationships are more consistent with those in smaller MeSH disease trees than larger ones, but there are some notable exceptions that may correlate with higher molecular content in MeSH. CONCLUSIONS: Our experiments provide insights about learning relationships between diseases from disease genes alone. Future work should explore the prospect of disease term discovery from molecular data and how best to integrate molecular data with anatomical and clinical knowledge. 
This study nonetheless suggests that disease gene information has the potential to form an important part of the foundation for future representations of the disease landscape. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-017-0134-0) contains supplementary material, which is available to authorized users. BioMed Central 2017-07-27 /pmc/articles/PMC5530939/ /pubmed/28750648 http://dx.doi.org/10.1186/s13326-017-0134-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Park, Jisoo Hescott, Benjamin J. Slonim, Donna K. Towards a more molecular taxonomy of disease |
title | Towards a more molecular taxonomy of disease |
title_full | Towards a more molecular taxonomy of disease |
title_fullStr | Towards a more molecular taxonomy of disease |
title_full_unstemmed | Towards a more molecular taxonomy of disease |
title_short | Towards a more molecular taxonomy of disease |
title_sort | towards a more molecular taxonomy of disease |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530939/ https://www.ncbi.nlm.nih.gov/pubmed/28750648 http://dx.doi.org/10.1186/s13326-017-0134-0 |
work_keys_str_mv | AT parkjisoo towardsamoremoleculartaxonomyofdisease AT hescottbenjaminj towardsamoremoleculartaxonomyofdisease AT slonimdonnak towardsamoremoleculartaxonomyofdisease |