A combined approach for genome wide protein function annotation/prediction
BACKGROUND: Today large scale genome sequencing technologies are uncovering an increasing amount of new genes and proteins, which remain uncharacterized. Experimental procedures for protein function prediction are low throughput by nature and thus can't be used to keep up with the rate at which...
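Main authors: | Benso, Alfredo; Di Carlo, Stefano; ur Rehman, Hafeez; Politano, Gianfranco; Savino, Alessandro; Suravajhala, Prashanth |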
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3909112/ https://www.ncbi.nlm.nih.gov/pubmed/24564915 http://dx.doi.org/10.1186/1477-5956-11-S1-S1 |
_version_ | 1782301792066863104 |
---|---|
author | Benso, Alfredo Di Carlo, Stefano ur Rehman, Hafeez Politano, Gianfranco Savino, Alessandro Suravajhala, Prashanth |
author_sort | Benso, Alfredo |
collection | PubMed |
description | BACKGROUND: Today, large-scale genome sequencing technologies are uncovering an increasing number of new genes and proteins, which remain uncharacterized. Experimental procedures for protein function prediction are low throughput by nature and thus can't be used to keep up with the rate at which new proteins are discovered. On the other hand, proteins are the prominent stakeholders in almost all biological processes, so their functions must be known precisely in order to understand the underlying biological mechanisms. The challenge of annotating uncharacterized proteins in functional genomics, and in biology in general, motivates the use of well-orchestrated computational techniques to predict their functions accurately. METHODS: We propose a computational flow for the functional annotation of a protein that assigns the most probable functions to the protein by aggregating heterogeneous information. The information considered includes protein motifs, protein sequence similarity, and protein homology data gathered from interacting proteins, combined with data from highly similar non-interacting proteins (hereinafter called Similactors). Moreover, to increase the predictive power of our model, we also compute and integrate term-specific relationships among functional terms based on Gene Ontology (GO). RESULTS: We tested our method on Saccharomyces cerevisiae and Homo sapiens proteins. The aggregation of different structural and functional evidence with GO relationships outperforms the other methods reported in the literature in terms of precision and accuracy of prediction. The precision and accuracy of prediction are 100% for more than half of the input set for both species; overall, we obtained 85.38% precision and 81.95% accuracy for Homo sapiens and 79.73% precision and 80.06% accuracy for Saccharomyces cerevisiae proteins. |
format | Online Article Text |
id | pubmed-3909112 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3909112 2014-02-13 A combined approach for genome wide protein function annotation/prediction Benso, Alfredo Di Carlo, Stefano ur Rehman, Hafeez Politano, Gianfranco Savino, Alessandro Suravajhala, Prashanth Proteome Sci Research BioMed Central 2013-11-07 /pmc/articles/PMC3909112/ /pubmed/24564915 http://dx.doi.org/10.1186/1477-5956-11-S1-S1 Text en Copyright © 2013 Benso et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | A combined approach for genome wide protein function annotation/prediction |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3909112/ https://www.ncbi.nlm.nih.gov/pubmed/24564915 http://dx.doi.org/10.1186/1477-5956-11-S1-S1 |
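
The METHODS summary in the description field says that functions are assigned by aggregating heterogeneous evidence (motifs, sequence similarity, interacting proteins, Similactors). The record does not give the scoring formula, so the following is only a minimal sketch of one generic way such an aggregation could look, assuming a simple weighted sum per GO term. The function names, the weights, the scores, and the placeholder term labels (`GO:A`, `GO:B`, `GO:C`) are all hypothetical and are not taken from the article; the GO term relationships mentioned in the abstract are not modeled here.

```python
from collections import defaultdict


def aggregate_evidence(evidence, weights):
    """Combine per-source scores into one score per GO term.

    evidence: {source_name: {go_term: score}}  (hypothetical layout)
    weights:  {source_name: weight}            (hypothetical values)
    A plain weighted sum is an assumption, not the article's formula.
    """
    totals = defaultdict(float)
    for source, term_scores in evidence.items():
        weight = weights.get(source, 0.0)  # unknown sources contribute nothing
        for term, score in term_scores.items():
            totals[term] += weight * score
    return dict(totals)


def top_terms(scores, k):
    """Return the k best-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def precision(predicted, known):
    """Fraction of predicted terms that appear among the known annotations."""
    predicted_set = set(predicted)
    if not predicted_set:
        return 0.0
    return len(predicted_set & set(known)) / len(predicted_set)


# Illustrative data only: placeholder term labels and made-up scores.
example_evidence = {
    "motifs": {"GO:A": 1.0, "GO:B": 0.5},
    "sequence_similarity": {"GO:A": 0.5, "GO:C": 1.0},
    "interactors": {"GO:B": 1.0},
    "similactors": {"GO:C": 0.5},
}
example_weights = {
    "motifs": 1.0,
    "sequence_similarity": 0.5,
    "interactors": 0.5,
    "similactors": 0.25,
}

example_scores = aggregate_evidence(example_evidence, example_weights)
example_prediction = top_terms(example_scores, k=2)
print(example_prediction, precision(example_prediction, ["GO:A", "GO:C"]))
```

With these made-up inputs, the two highest-scoring terms are compared against a hypothetical set of known annotations to obtain a precision value; the percentages reported in the abstract come from the authors' own evaluation and cannot be reproduced from this sketch.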