Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods
BACKGROUND: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides...
Main authors: | Notaro, Marco; Schubach, Max; Robinson, Peter N.; Valentini, Giorgio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5639780/ https://www.ncbi.nlm.nih.gov/pubmed/29025394 http://dx.doi.org/10.1186/s12859-017-1854-y |
_version_ | 1783270944821739520 |
---|---|
author | Notaro, Marco Schubach, Max Robinson, Peter N. Valentini, Giorgio |
author_facet | Notaro, Marco Schubach, Max Robinson, Peter N. Valentini, Giorgio |
author_sort | Notaro, Marco |
collection | PubMed |
description | BACKGROUND: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene–disease associations has been widely investigated, the related problem of gene–phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. RESULTS: We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a “flat” learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. CONCLUSIONS: Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1854-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5639780 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-56397802017-10-18 Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods Notaro, Marco Schubach, Max Robinson, Peter N. Valentini, Giorgio BMC Bioinformatics Research Article BACKGROUND: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene–disease associations has been widely investigated, the related problem of gene–phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. RESULTS: We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a “flat” learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. CONCLUSIONS: Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1854-y) contains supplementary material, which is available to authorized users. BioMed Central 2017-10-12 /pmc/articles/PMC5639780/ /pubmed/29025394 http://dx.doi.org/10.1186/s12859-017-1854-y Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Notaro, Marco Schubach, Max Robinson, Peter N. Valentini, Giorgio Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title | Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title_full | Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title_fullStr | Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title_full_unstemmed | Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title_short | Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods |
title_sort | prediction of human phenotype ontology terms by means of hierarchical ensemble methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5639780/ https://www.ncbi.nlm.nih.gov/pubmed/29025394 http://dx.doi.org/10.1186/s12859-017-1854-y |
work_keys_str_mv | AT notaromarco predictionofhumanphenotypeontologytermsbymeansofhierarchicalensemblemethods AT schubachmax predictionofhumanphenotypeontologytermsbymeansofhierarchicalensemblemethods AT robinsonpetern predictionofhumanphenotypeontologytermsbymeansofhierarchicalensemblemethods AT valentinigiorgio predictionofhumanphenotypeontologytermsbymeansofhierarchicalensemblemethods |
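
The abstract describes a two-step scheme: a "flat" learner scores each HPO term independently, and a hierarchical step then combines those scores so that they obey the true path rule (a gene annotated to a term is implicitly annotated to all of that term's ancestors). The Python sketch below illustrates one simple way such a consistency step could work, a top-down pass that caps each term's score by the lowest corrected score among its parents. It is an illustrative assumption, not the authors' R implementation or either of the two methods from the article; the function name `top_down_correction`, the term labels, and the scores are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def top_down_correction(flat_scores, parents):
    """Make per-term scores consistent with a DAG-shaped ontology.

    flat_scores: dict mapping every term to a score in [0, 1] produced by
                 any flat classifier (hypothetical input).
    parents:     dict mapping a term to the list of its parent terms;
                 every parent must also appear in flat_scores.
    Returns a dict in which no term scores higher than any of its parents.
    """
    # TopologicalSorter takes node -> predecessors, so parents come first.
    graph = {term: parents.get(term, []) for term in flat_scores}
    corrected = {}
    for term in TopologicalSorter(graph).static_order():
        # Root terms have no parents, so their cap defaults to 1.0.
        cap = min((corrected[p] for p in parents.get(term, [])), default=1.0)
        corrected[term] = min(flat_scores[term], cap)
    return corrected


# Toy ontology with hypothetical labels: "C" has two parents, "A" and "B".
parents = {"A": ["root"], "B": ["root"], "C": ["A", "B"]}
flat_scores = {"root": 0.9, "A": 0.4, "B": 0.7, "C": 0.8}

corrected = top_down_correction(flat_scores, parents)
print(corrected)
```

In this toy example the flat score of `C` (0.8) exceeds that of its parent `A` (0.4), which would violate the true path rule; the top-down pass lowers `C` to the smaller of its parents' scores and leaves the already consistent terms unchanged.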