
Fast local fragment chaining using sum-of-pair gap costs

BACKGROUND: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity an...


Bibliographic Details
Main Authors: Otto, Christian; Hoffmann, Steve; Gorodkin, Jan; Stadler, Peter F
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072320/
https://www.ncbi.nlm.nih.gov/pubmed/21418573
http://dx.doi.org/10.1186/1748-7188-6-4
author Otto, Christian
Hoffmann, Steve
Gorodkin, Jan
Stadler, Peter F
collection PubMed
description BACKGROUND: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity and specificity of these tools, however, crucially depend on parameters such as seed sizes or maximum expectation values. In settings that require high sensitivity the amount of short local match fragments easily becomes intractable. Then, fragment chaining is a powerful leverage to quickly connect, score, and rank the fragments to improve the specificity. RESULTS: Here we present a fast and flexible fragment chainer that for the first time also supports a sum-of-pair gap cost model. This model has proven to achieve a higher accuracy and sensitivity in its own field of application. Due to a highly time-efficient index structure our method outperforms the only existing tool for fragment chaining under the linear gap cost model. It can easily be applied to the output generated by alignment tools such as segemehl or BLAST. As an example we consider homology-based searches for human and mouse snoRNAs demonstrating that a highly sensitive BLAST search with subsequent chaining is an attractive option. The sum-of-pair gap costs provide a substantial advantage in this context. CONCLUSIONS: Chaining of short match fragments helps to quickly and accurately identify regions of homology that may not be found using local alignment heuristics alone. By providing both the linear and the sum-of-pair gap cost model, a wider range of applications can be covered. The software clasp is available at http://www.bioinf.uni-leipzig.de/Software/clasp/.
format Text
id pubmed-3072320
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3072320 2011-04-08 Fast local fragment chaining using sum-of-pair gap costs Otto, Christian Hoffmann, Steve Gorodkin, Jan Stadler, Peter F Algorithms Mol Biol Software Article
BioMed Central 2011-03-18 /pmc/articles/PMC3072320/ /pubmed/21418573 http://dx.doi.org/10.1186/1748-7188-6-4 Text en Copyright ©2011 Otto et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fast local fragment chaining using sum-of-pair gap costs
topic Software Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072320/
https://www.ncbi.nlm.nih.gov/pubmed/21418573
http://dx.doi.org/10.1186/1748-7188-6-4