ProFAT: a web-based tool for the functional annotation of protein sequences

BACKGROUND: The functional annotation of proteins relies on published information concerning their close and remote homologues in sequence databases. Evidence for remote sequence similarity can be further strengthened by a similar biological background of the query sequence and identified database s...

Bibliographic Details
Main Authors: Bradshaw, Charles Richard, Surendranath, Vineeth, Habermann, Bianca
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636073/
https://www.ncbi.nlm.nih.gov/pubmed/17059594
http://dx.doi.org/10.1186/1471-2105-7-466
author Bradshaw, Charles Richard
Surendranath, Vineeth
Habermann, Bianca
collection PubMed
description BACKGROUND: The functional annotation of proteins relies on published information concerning their close and remote homologues in sequence databases. Evidence for remote sequence similarity can be further strengthened by a similar biological background of the query sequence and identified database sequences. However, few tools exist so far that provide a means to include functional information in sequence database searches. RESULTS: We present ProFAT, a web-based tool for the functional annotation of protein sequences based on remote sequence similarity. ProFAT combines sensitive sequence database search methods and a fold recognition algorithm with a simple text-mining approach. ProFAT extracts identified hits based on their biological background by keyword-mining of annotations, features and, most importantly, literature associated with a sequence entry. A user-provided keyword list enables the user to specifically search for weak but biologically relevant homologues of an input query. The ProFAT server has been evaluated using the complete set of proteins from three different domain families, including their weak relatives, and could correctly identify between 90% and 100% of all domain family members studied in this context. ProFAT has furthermore been applied to a variety of proteins from different cellular contexts, and we provide evidence on how ProFAT can help in functional prediction of proteins based on remotely conserved proteins. CONCLUSION: By employing sensitive database search programs as well as exploiting the functional information associated with database sequences, ProFAT can detect remote but biologically relevant relationships between proteins and will assist researchers in the prediction of protein function based on remote homologies.
format Text
id pubmed-1636073
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1636073 2006-11-15 ProFAT: a web-based tool for the functional annotation of protein sequences Bradshaw, Charles Richard Surendranath, Vineeth Habermann, Bianca BMC Bioinformatics Software
BioMed Central 2006-10-23 /pmc/articles/PMC1636073/ /pubmed/17059594 http://dx.doi.org/10.1186/1471-2105-7-466 Text en Copyright © 2006 Bradshaw et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ProFAT: a web-based tool for the functional annotation of protein sequences
topic Software