Large-scale protein function prediction using heterogeneous ensembles
Heterogeneous ensembles are an effective approach in scenarios where the ideal data type and/or individual predictor are unclear for a given problem. These ensembles have shown promise for protein function prediction (PFP), but their ability to improve PFP at a large scale is unclear. The overall go...
Main authors: | Wang, Linhua; Law, Jeffrey; Kale, Shiv D.; Murali, T. M.; Pandey, Gaurav |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited, 2018 |
Subjects: | Method Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6221071/ https://www.ncbi.nlm.nih.gov/pubmed/30450194 http://dx.doi.org/10.12688/f1000research.16415.1 |
_version_ | 1783368950770302976 |
---|---|
author | Wang, Linhua Law, Jeffrey Kale, Shiv D. Murali, T. M. Pandey, Gaurav |
author_facet | Wang, Linhua Law, Jeffrey Kale, Shiv D. Murali, T. M. Pandey, Gaurav |
author_sort | Wang, Linhua |
collection | PubMed |
description | Heterogeneous ensembles are an effective approach in scenarios where the ideal data type and/or individual predictor are unclear for a given problem. These ensembles have shown promise for protein function prediction (PFP), but their ability to improve PFP at a large scale is unclear. The overall goal of this study is to critically assess this ability of a variety of heterogeneous ensemble methods across a multitude of functional terms, proteins and organisms. Our results show that these methods, especially Stacking using Logistic Regression, indeed produce more accurate predictions for a variety of Gene Ontology terms differing in size and specificity. To enable the application of these methods to other related problems, we have publicly shared the HPC-enabled code underlying this work as LargeGOPred (https://github.com/GauravPandeyLab/LargeGOPred). |
format | Online Article Text |
id | pubmed-6221071 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-6221071 2018-11-15 Large-scale protein function prediction using heterogeneous ensembles Wang, Linhua Law, Jeffrey Kale, Shiv D. Murali, T. M. Pandey, Gaurav F1000Res Method Article Heterogeneous ensembles are an effective approach in scenarios where the ideal data type and/or individual predictor are unclear for a given problem. These ensembles have shown promise for protein function prediction (PFP), but their ability to improve PFP at a large scale is unclear. The overall goal of this study is to critically assess this ability of a variety of heterogeneous ensemble methods across a multitude of functional terms, proteins and organisms. Our results show that these methods, especially Stacking using Logistic Regression, indeed produce more accurate predictions for a variety of Gene Ontology terms differing in size and specificity. To enable the application of these methods to other related problems, we have publicly shared the HPC-enabled code underlying this work as LargeGOPred (https://github.com/GauravPandeyLab/LargeGOPred). F1000 Research Limited 2018-09-28 /pmc/articles/PMC6221071/ /pubmed/30450194 http://dx.doi.org/10.12688/f1000research.16415.1 Text en Copyright: © 2018 Wang L et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Large-scale protein function prediction using heterogeneous ensembles |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6221071/ https://www.ncbi.nlm.nih.gov/pubmed/30450194 http://dx.doi.org/10.12688/f1000research.16415.1 |