New finite-size correction for local alignment score distributions
BACKGROUND: Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a “finite-size” correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before...
Main authors: | Park, Yonil; Sheetlin, Sergey; Ma, Ning; Madden, Thomas L; Spouge, John L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483159/ https://www.ncbi.nlm.nih.gov/pubmed/22691307 http://dx.doi.org/10.1186/1756-0500-5-286 |
_version_ | 1782247953638883328 |
---|---|
author | Park, Yonil; Sheetlin, Sergey; Ma, Ning; Madden, Thomas L; Spouge, John L |
author_sort | Park, Yonil |
collection | PubMed |
description | BACKGROUND: Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a “finite-size” correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score. FINDINGS: We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences, a substitution that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences. CONCLUSIONS: The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site (http://blast.ncbi.nlm.nih.gov). |
format | Online Article Text |
id | pubmed-3483159 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3483159 2012-11-05 BMC Res Notes Technical Note BioMed Central 2012-06-12 /pmc/articles/PMC3483159/ /pubmed/22691307 http://dx.doi.org/10.1186/1756-0500-5-286 Text en Copyright ©2012 Park et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | New finite-size correction for local alignment score distributions |
topic | Technical Note |
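
The abstract notes that an alignment starting near the end of a sequence may run out of sequence, and that short sequences have required an ad hoc substitute length. As background only, the following minimal sketch shows the older style of length adjustment commonly described for Karlin-Altschul statistics, in which an expected alignment length is subtracted from both sequence lengths. It is not the new correction presented in this record, whose formulas are not given here. The function names, the `min_length` floor, and the parameter values (typical published values for gapped BLOSUM62 scoring) are assumptions chosen for illustration.

```python
import math


def raw_evalue(score, m, n, lam, k):
    """Uncorrected Karlin-Altschul E-value: K * m * n * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)


def length_adjusted_evalue(score, m, n, lam, k, h, min_length=1.0):
    """E-value with a simple edge-effect length adjustment (illustrative).

    The expected alignment length ell = ln(K * m * n) / H is subtracted
    from each sequence length. When a sequence is shorter than ell, the
    effective length would be non-positive, so an ad hoc floor
    (min_length, an assumption here) is substituted instead.
    """
    ell = max(0.0, math.log(k * m * n) / h)
    m_eff = max(m - ell, min_length)
    n_eff = max(n - ell, min_length)
    return k * m_eff * n_eff * math.exp(-lam * score)


# Illustrative parameters (assumed, not taken from this record).
LAM, K, H = 0.267, 0.041, 0.14

if __name__ == "__main__":
    for m, n in [(300, 300), (20, 20)]:
        print(m, n,
              raw_evalue(50, m, n, LAM, K),
              length_adjusted_evalue(50, m, n, LAM, K, H))
```

For two sequences of length 20 with these parameters, the expected alignment length is almost the whole sequence, so the floor takes over. That is the kind of ad hoc substitution for short sequences that the abstract says the new correction avoids.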