The Gumbel pre-factor k for gapped local alignment can be estimated from simulations of global alignment
The optimal gapped local alignment score of two random sequences follows a Gumbel distribution. The Gumbel distribution has two parameters, the scale parameter λ and the pre-factor k. Presently, the basic local alignment search tool (BLAST) programs (BLASTP (BLAST for proteins), PSI-BLAST, etc.) use...
Main Authors: | Sheetlin, Sergey; Park, Yonil; Spouge, John L. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2005 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199557/ https://www.ncbi.nlm.nih.gov/pubmed/16147981 http://dx.doi.org/10.1093/nar/gki800 |
_version_ | 1782124868603478016 |
---|---|
author | Sheetlin, Sergey Park, Yonil Spouge, John L. |
author_facet | Sheetlin, Sergey Park, Yonil Spouge, John L. |
author_sort | Sheetlin, Sergey |
collection | PubMed |
description | The optimal gapped local alignment score of two random sequences follows a Gumbel distribution. The Gumbel distribution has two parameters, the scale parameter λ and the pre-factor k. Presently, the basic local alignment search tool (BLAST) programs (BLASTP (BLAST for proteins), PSI-BLAST, etc.) all use time-consuming computer simulations to determine the Gumbel parameters. Because the simulations must be done offline, BLAST users are restricted in their choice of alignment scoring schemes. The ultimate aim of this paper is to speed the simulations, to determine the Gumbel parameters online, and to remove the corresponding restrictions on BLAST users. Simulations for the scale parameter λ can be as much as five times faster if they use global instead of local alignment [R. Bundschuh (2002) J. Comput. Biol., 9, 243–260]. Unfortunately, the acceleration does not extend to determining the Gumbel pre-factor k, because k has no known mathematical relationship to global alignment. This paper relates k to global alignment and exploits the relationship to show that for the BLASTP defaults, 10 000 realizations with sequences of average length 140 suffice to estimate both Gumbel parameters λ and k within the errors required (λ, 0.8%; k, 10%). For the BLASTP defaults, simulations for both Gumbel parameters now take less than 30 s on a 2.8 GHz Pentium 4 processor. |
format | Text |
id | pubmed-1199557 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1199557 2005-09-15 The Gumbel pre-factor k for gapped local alignment can be estimated from simulations of global alignment Sheetlin, Sergey Park, Yonil Spouge, John L. Nucleic Acids Res Article The optimal gapped local alignment score of two random sequences follows a Gumbel distribution. The Gumbel distribution has two parameters, the scale parameter λ and the pre-factor k. Presently, the basic local alignment search tool (BLAST) programs (BLASTP (BLAST for proteins), PSI-BLAST, etc.) all use time-consuming computer simulations to determine the Gumbel parameters. Because the simulations must be done offline, BLAST users are restricted in their choice of alignment scoring schemes. The ultimate aim of this paper is to speed the simulations, to determine the Gumbel parameters online, and to remove the corresponding restrictions on BLAST users. Simulations for the scale parameter λ can be as much as five times faster if they use global instead of local alignment [R. Bundschuh (2002) J. Comput. Biol., 9, 243–260]. Unfortunately, the acceleration does not extend to determining the Gumbel pre-factor k, because k has no known mathematical relationship to global alignment. This paper relates k to global alignment and exploits the relationship to show that for the BLASTP defaults, 10 000 realizations with sequences of average length 140 suffice to estimate both Gumbel parameters λ and k within the errors required (λ, 0.8%; k, 10%). For the BLASTP defaults, simulations for both Gumbel parameters now take less than 30 s on a 2.8 GHz Pentium 4 processor. Oxford University Press 2005 2005-09-06 /pmc/articles/PMC1199557/ /pubmed/16147981 http://dx.doi.org/10.1093/nar/gki800 Text en © The Author 2005. Published by Oxford University Press. All rights reserved |
spellingShingle | Article Sheetlin, Sergey Park, Yonil Spouge, John L. The Gumbel pre-factor k for gapped local alignment can be estimated from simulations of global alignment |
title | The Gumbel pre-factor k for gapped local alignment can be estimated from simulations of global alignment |
title_sort | gumbel pre-factor k for gapped local alignment can be estimated from simulations of global alignment |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199557/ https://www.ncbi.nlm.nih.gov/pubmed/16147981 http://dx.doi.org/10.1093/nar/gki800 |
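
The description in this record states that the optimal gapped local alignment score follows a Gumbel distribution with scale parameter λ and pre-factor k. A minimal sketch of how the two parameters enter a significance calculation is given here, assuming the standard Karlin-Altschul form P(S ≥ x) ≈ 1 − exp(−k·m·n·e^(−λx)) for sequences of lengths m and n. The function names, the example scores, and the numeric values of `lam` and `k` are illustrative assumptions and are not taken from the record.

```python
import math


def expected_chance_hits(score, m, n, lam, k):
    """Expected number of chance alignments scoring at least `score`.

    Assumes the Karlin-Altschul form E = k * m * n * exp(-lam * score),
    where m and n are the lengths of the two sequences.
    """
    return k * m * n * math.exp(-lam * score)


def gumbel_tail_probability(score, m, n, lam, k):
    """Approximate P(S >= score) = 1 - exp(-E) under the Gumbel model.

    expm1 keeps the result accurate when E is very small.
    """
    return -math.expm1(-expected_chance_hits(score, m, n, lam, k))


if __name__ == "__main__":
    # Illustrative parameter values only (assumed, not from the record).
    lam, k = 0.267, 0.041
    m = n = 140  # sequence length mentioned in the description
    for score in (30, 40, 50):
        e_value = expected_chance_hits(score, m, n, lam, k)
        p_value = gumbel_tail_probability(score, m, n, lam, k)
        print(f"score={score}  E={e_value:.3g}  P={p_value:.3g}")
```

Under this form, k scales the expected count linearly, whereas λ sits in the exponent and is multiplied by the score. That is one way to read the much tighter error tolerance the description quotes for λ (0.8%) than for k (10%).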