
Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms

BACKGROUND: Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annota...


Bibliographic Details
Main authors: Falda, Marco, Toppo, Stefano, Pescarolo, Alessandro, Lavezzo, Enrico, Di Camillo, Barbara, Facchinetti, Andrea, Cilia, Elisa, Velasco, Riccardo, Fontana, Paolo
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3314586/
https://www.ncbi.nlm.nih.gov/pubmed/22536960
http://dx.doi.org/10.1186/1471-2105-13-S4-S14
author Falda, Marco
Toppo, Stefano
Pescarolo, Alessandro
Lavezzo, Enrico
Di Camillo, Barbara
Facchinetti, Andrea
Cilia, Elisa
Velasco, Riccardo
Fontana, Paolo
author_facet Falda, Marco
Toppo, Stefano
Pescarolo, Alessandro
Lavezzo, Enrico
Di Camillo, Barbara
Facchinetti, Andrea
Cilia, Elisa
Velasco, Riccardo
Fontana, Paolo
author_sort Falda, Marco
collection PubMed
description BACKGROUND: Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annotations at a genomic scale, are necessary and urgent. In this scenario, the Gene Ontology has provided the means to standardize the annotation classification with a structured vocabulary which can be easily exploited by computational methods. RESULTS: Argot2 is a web-based function prediction tool able to annotate nucleic or protein sequences from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST and HMMER searches vs UniProtKB and Pfam databases respectively; these sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their semantic similarity relations described by the Gene Ontology and their associated score. The algorithm is based on the original idea developed in a previous tool called Argot. The entire engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes. CONCLUSIONS: The revised algorithm has already been employed and successfully tested during in-house genome projects of grape and apple, and has proven to have a high precision and recall in all our benchmark conditions. It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2.
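The abstract states that GO terms are weighted using the e-values from BLAST and HMMER, but it does not give the formula. The sketch below illustrates one common way to do this, assuming a weight of -log10(e-value) per hit, summed over all hits that carry a given GO term. The function name `weight_go_terms`, the `min_evalue` clamp, and the hit-to-term mapping are hypothetical choices made for illustration and are not taken from Argot2 itself. The later step, grouping the weighted terms by semantic similarity, is not covered here.

```python
import math
from collections import defaultdict


def weight_go_terms(hits, annotations, min_evalue=1e-300):
    """Accumulate an assumed -log10(e-value) weight per GO term.

    hits:        iterable of (hit_id, evalue) pairs, e.g. from BLAST or HMMER
    annotations: dict mapping hit_id -> list of GO term identifiers
    min_evalue:  clamp so an e-value reported as 0.0 does not break log10
    """
    weights = defaultdict(float)
    for hit_id, evalue in hits:
        score = -math.log10(max(evalue, min_evalue))
        if score <= 0:
            # An e-value of 1 or more carries no evidence in this scheme.
            continue
        for go_term in annotations.get(hit_id, ()):
            weights[go_term] += score
    return dict(weights)


# Illustrative input only: these hit/term pairings are made up for the example.
hits = [("P12345", 1e-50), ("Q67890", 1e-10), ("PF00069", 1e-20)]
annotations = {
    "P12345": ["GO:0004672", "GO:0005524"],
    "Q67890": ["GO:0005524"],
    "PF00069": ["GO:0004672"],
}
weights = weight_go_terms(hits, annotations)
```

Under this assumed scheme a term supported by several strong hits accumulates a larger weight than one supported by a single weak hit, which is the property a downstream ranking step would rely on.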
format Online
Article
Text
id pubmed-3314586
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-33145862012-04-02 Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms Falda, Marco Toppo, Stefano Pescarolo, Alessandro Lavezzo, Enrico Di Camillo, Barbara Facchinetti, Andrea Cilia, Elisa Velasco, Riccardo Fontana, Paolo BMC Bioinformatics Research BACKGROUND: Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annotations at a genomic scale, are necessary and urgent. In this scenario, the Gene Ontology has provided the means to standardize the annotation classification with a structured vocabulary which can be easily exploited by computational methods. RESULTS: Argot2 is a web-based function prediction tool able to annotate nucleic or protein sequences from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST and HMMER searches vs UniProtKB and Pfam databases respectively; these sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their semantic similarity relations described by the Gene Ontology and their associated score. The algorithm is based on the original idea developed in a previous tool called Argot. The entire engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes. CONCLUSIONS: The revised algorithm has already been employed and successfully tested during in-house genome projects of grape and apple, and has proven to have a high precision and recall in all our benchmark conditions. 
It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2. BioMed Central 2012-03-28 /pmc/articles/PMC3314586/ /pubmed/22536960 http://dx.doi.org/10.1186/1471-2105-13-S4-S14 Text en Copyright ©2012 Falda et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Falda, Marco
Toppo, Stefano
Pescarolo, Alessandro
Lavezzo, Enrico
Di Camillo, Barbara
Facchinetti, Andrea
Cilia, Elisa
Velasco, Riccardo
Fontana, Paolo
Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title_full Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title_fullStr Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title_full_unstemmed Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title_short Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms
title_sort argot2: a large scale function prediction tool relying on semantic similarity of weighted gene ontology terms
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3314586/
https://www.ncbi.nlm.nih.gov/pubmed/22536960
http://dx.doi.org/10.1186/1471-2105-13-S4-S14