Domain enhanced lookup time accelerated BLAST

BACKGROUND: BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construc...

Bibliographic Details
Main Authors: Boratyn, Grzegorz M, Schäffer, Alejandro A, Agarwala, Richa, Altschul, Stephen F, Lipman, David J, Madden, Thomas L
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3438057/
https://www.ncbi.nlm.nih.gov/pubmed/22510480
http://dx.doi.org/10.1186/1745-6150-7-12
_version_ 1782242857631875072
author Boratyn, Grzegorz M
Schäffer, Alejandro A
Agarwala, Richa
Altschul, Stephen F
Lipman, David J
Madden, Thomas L
author_sort Boratyn, Grzegorz M
collection PubMed
description BACKGROUND: BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch. RESULTS: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI’s Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC(5000) of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST. CONCLUSIONS: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the “Protein BLAST” link at http://blast.ncbi.nlm.nih.gov. REVIEWERS: This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.
format Online
Article
Text
id pubmed-3438057
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-34380572012-09-12 Domain enhanced lookup time accelerated BLAST Boratyn, Grzegorz M Schäffer, Alejandro A Agarwala, Richa Altschul, Stephen F Lipman, David J Madden, Thomas L Biol Direct Research BACKGROUND: BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch. RESULTS: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI’s Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC(5000) of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST. CONCLUSIONS: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the “Protein BLAST” link at http://blast.ncbi.nlm.nih.gov. REVIEWERS: This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber. BioMed Central 2012-04-17 /pmc/articles/PMC3438057/ /pubmed/22510480 http://dx.doi.org/10.1186/1745-6150-7-12 Text en Copyright ©2012 Boratyn et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Boratyn, Grzegorz M
Schäffer, Alejandro A
Agarwala, Richa
Altschul, Stephen F
Lipman, David J
Madden, Thomas L
Domain enhanced lookup time accelerated BLAST
title Domain enhanced lookup time accelerated BLAST
title_sort domain enhanced lookup time accelerated blast
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3438057/
https://www.ncbi.nlm.nih.gov/pubmed/22510480
http://dx.doi.org/10.1186/1745-6150-7-12