Towards a more molecular taxonomy of disease

BACKGROUND: Disease taxonomies have been designed for many applications, but they tend not to fully incorporate the growing amount of molecular-level knowledge of disease processes, inhibiting research efforts. Understanding the degree to which we can infer disease relationships from molecular data...

Bibliographic Details
Main Authors: Park, Jisoo; Hescott, Benjamin J.; Slonim, Donna K.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530939/
https://www.ncbi.nlm.nih.gov/pubmed/28750648
http://dx.doi.org/10.1186/s13326-017-0134-0
author Park, Jisoo
Hescott, Benjamin J.
Slonim, Donna K.
collection PubMed
description BACKGROUND: Disease taxonomies have been designed for many applications, but they tend not to fully incorporate the growing amount of molecular-level knowledge of disease processes, inhibiting research efforts. Understanding the degree to which we can infer disease relationships from molecular data alone may yield insights into how to ultimately construct more modern taxonomies that integrate both physiological and molecular information.
RESULTS: We introduce a new technique we call Parent Promotion to infer hierarchical relationships between disease terms using disease-gene data. We compare this technique with both an established ontology inference method (CliXO) and a minimum weight spanning tree approach. Because there is no gold standard molecular disease taxonomy available, we compare our inferred hierarchies to both the Medical Subject Headings (MeSH) category C forest of diseases and to subnetworks of the Disease Ontology (DO). This comparison provides insights about the inference algorithms, choices of evaluation metrics, and the existing molecular content of various subnetworks of MeSH and the DO. Our results suggest that the Parent Promotion method performs well in most cases. Performance across MeSH trees is also correlated between inference methods. Specifically, inferred relationships are more consistent with those in smaller MeSH disease trees than larger ones, but there are some notable exceptions that may correlate with higher molecular content in MeSH.
CONCLUSIONS: Our experiments provide insights about learning relationships between diseases from disease genes alone. Future work should explore the prospect of disease term discovery from molecular data and how best to integrate molecular data with anatomical and clinical knowledge. This study nonetheless suggests that disease gene information has the potential to form an important part of the foundation for future representations of the disease landscape.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-017-0134-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5530939
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5530939 2017-08-02
Towards a more molecular taxonomy of disease
Park, Jisoo; Hescott, Benjamin J.; Slonim, Donna K.
J Biomed Semantics
Research
BioMed Central 2017-07-27
/pmc/articles/PMC5530939/ /pubmed/28750648 http://dx.doi.org/10.1186/s13326-017-0134-0
Text en
© The Author(s) 2017. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Towards a more molecular taxonomy of disease
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530939/
https://www.ncbi.nlm.nih.gov/pubmed/28750648
http://dx.doi.org/10.1186/s13326-017-0134-0