Fast local fragment chaining using sum-of-pair gap costs
BACKGROUND: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity an...
Main Authors: | Otto, Christian; Hoffmann, Steve; Gorodkin, Jan; Stadler, Peter F |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Software Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072320/ https://www.ncbi.nlm.nih.gov/pubmed/21418573 http://dx.doi.org/10.1186/1748-7188-6-4 |
_version_ | 1782201530964770816 |
---|---|
author | Otto, Christian; Hoffmann, Steve; Gorodkin, Jan; Stadler, Peter F |
author_facet | Otto, Christian; Hoffmann, Steve; Gorodkin, Jan; Stadler, Peter F |
author_sort | Otto, Christian |
collection | PubMed |
description | BACKGROUND: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity and specificity of these tools, however, crucially depend on parameters such as seed sizes or maximum expectation values. In settings that require high sensitivity, the number of short local match fragments easily becomes intractable. Then, fragment chaining is a powerful means to quickly connect, score, and rank the fragments to improve the specificity. RESULTS: Here we present a fast and flexible fragment chainer that for the first time also supports a sum-of-pair gap cost model. This model has proven to achieve a higher accuracy and sensitivity in its own field of application. Due to a highly time-efficient index structure, our method outperforms the only existing tool for fragment chaining under the linear gap cost model. It can easily be applied to the output generated by alignment tools such as segemehl or BLAST. As an example, we consider homology-based searches for human and mouse snoRNAs, demonstrating that a highly sensitive BLAST search with subsequent chaining is an attractive option. The sum-of-pair gap costs provide a substantial advantage in this context. CONCLUSIONS: Chaining of short match fragments helps to quickly and accurately identify regions of homology that may not be found using local alignment heuristics alone. By providing both the linear and the sum-of-pair gap cost model, a wider range of applications can be covered. The software clasp is available at http://www.bioinf.uni-leipzig.de/Software/clasp/. |
format | Text |
id | pubmed-3072320 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3072320 2011-04-08 Fast local fragment chaining using sum-of-pair gap costs Otto, Christian Hoffmann, Steve Gorodkin, Jan Stadler, Peter F Algorithms Mol Biol Software Article BACKGROUND: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity and specificity of these tools, however, crucially depend on parameters such as seed sizes or maximum expectation values. In settings that require high sensitivity, the number of short local match fragments easily becomes intractable. Then, fragment chaining is a powerful means to quickly connect, score, and rank the fragments to improve the specificity. RESULTS: Here we present a fast and flexible fragment chainer that for the first time also supports a sum-of-pair gap cost model. This model has proven to achieve a higher accuracy and sensitivity in its own field of application. Due to a highly time-efficient index structure, our method outperforms the only existing tool for fragment chaining under the linear gap cost model. It can easily be applied to the output generated by alignment tools such as segemehl or BLAST. As an example, we consider homology-based searches for human and mouse snoRNAs, demonstrating that a highly sensitive BLAST search with subsequent chaining is an attractive option. The sum-of-pair gap costs provide a substantial advantage in this context. CONCLUSIONS: Chaining of short match fragments helps to quickly and accurately identify regions of homology that may not be found using local alignment heuristics alone. By providing both the linear and the sum-of-pair gap cost model, a wider range of applications can be covered. The software clasp is available at http://www.bioinf.uni-leipzig.de/Software/clasp/.
BioMed Central 2011-03-18 /pmc/articles/PMC3072320/ /pubmed/21418573 http://dx.doi.org/10.1186/1748-7188-6-4 Text en Copyright ©2011 Otto et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Article Otto, Christian Hoffmann, Steve Gorodkin, Jan Stadler, Peter F Fast local fragment chaining using sum-of-pair gap costs |
title | Fast local fragment chaining using sum-of-pair gap costs |
title_full | Fast local fragment chaining using sum-of-pair gap costs |
title_fullStr | Fast local fragment chaining using sum-of-pair gap costs |
title_full_unstemmed | Fast local fragment chaining using sum-of-pair gap costs |
title_short | Fast local fragment chaining using sum-of-pair gap costs |
title_sort | fast local fragment chaining using sum-of-pair gap costs |
topic | Software Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072320/ https://www.ncbi.nlm.nih.gov/pubmed/21418573 http://dx.doi.org/10.1186/1748-7188-6-4 |
work_keys_str_mv | AT ottochristian fastlocalfragmentchainingusingsumofpairgapcosts AT hoffmannsteve fastlocalfragmentchainingusingsumofpairgapcosts AT gorodkinjan fastlocalfragmentchainingusingsumofpairgapcosts AT stadlerpeterf fastlocalfragmentchainingusingsumofpairgapcosts |