PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index
MOTIVATION: Sequence alignment is one of the first steps in many modern genomic analyses, such as variant detection, transcript abundance estimation and metagenomic profiling. Unfortunately, it is often a computationally expensive procedure. As the quantity of data and wealth of different assays and...
Main authors: | Almodaresi, Fatemeh; Zakeri, Mohsen; Patro, Rob |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502150/ https://www.ncbi.nlm.nih.gov/pubmed/34117875 http://dx.doi.org/10.1093/bioinformatics/btab408 |
_version_ | 1784795635642269696 |
---|---|
author | Almodaresi, Fatemeh Zakeri, Mohsen Patro, Rob |
author_facet | Almodaresi, Fatemeh Zakeri, Mohsen Patro, Rob |
author_sort | Almodaresi, Fatemeh |
collection | PubMed |
description | MOTIVATION: Sequence alignment is one of the first steps in many modern genomic analyses, such as variant detection, transcript abundance estimation and metagenomic profiling. Unfortunately, it is often a computationally expensive procedure. As the quantity of data and wealth of different assays and applications continue to grow, the need for accurate and fast alignment tools that scale to large collections of reference sequences persists. RESULTS: In this article, we introduce PuffAligner, a fast, accurate and versatile aligner built on top of the Pufferfish index. PuffAligner is able to produce highly sensitive alignments, similar to those of Bowtie2, but much more quickly. While exhibiting similar speed to the ultrafast STAR aligner, PuffAligner requires considerably less memory to construct its index and align reads. PuffAligner strikes a desirable balance with respect to the time, space and accuracy tradeoffs made by different alignment tools and provides a promising foundation on which to test new alignment ideas over large collections of sequences. AVAILABILITY AND IMPLEMENTATION: All the data used for preparing the results of this paper can be found with 10.5281/zenodo.4902332. PuffAligner is a free and open-source software. It is implemented in C++14 and can be obtained from https://github.com/COMBINE-lab/pufferfish/tree/cigar-strings. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9502150 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-95021502022-09-26 PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index Almodaresi, Fatemeh Zakeri, Mohsen Patro, Rob Bioinformatics Original Papers MOTIVATION: Sequence alignment is one of the first steps in many modern genomic analyses, such as variant detection, transcript abundance estimation and metagenomic profiling. Unfortunately, it is often a computationally expensive procedure. As the quantity of data and wealth of different assays and applications continue to grow, the need for accurate and fast alignment tools that scale to large collections of reference sequences persists. RESULTS: In this article, we introduce PuffAligner, a fast, accurate and versatile aligner built on top of the Pufferfish index. PuffAligner is able to produce highly sensitive alignments, similar to those of Bowtie2, but much more quickly. While exhibiting similar speed to the ultrafast STAR aligner, PuffAligner requires considerably less memory to construct its index and align reads. PuffAligner strikes a desirable balance with respect to the time, space and accuracy tradeoffs made by different alignment tools and provides a promising foundation on which to test new alignment ideas over large collections of sequences. AVAILABILITY AND IMPLEMENTATION: All the data used for preparing the results of this paper can be found with 10.5281/zenodo.4902332. PuffAligner is a free and open-source software. It is implemented in C++14 and can be obtained from https://github.com/COMBINE-lab/pufferfish/tree/cigar-strings. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-06-12 /pmc/articles/PMC9502150/ /pubmed/34117875 http://dx.doi.org/10.1093/bioinformatics/btab408 Text en © The Author(s) 2021. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Almodaresi, Fatemeh Zakeri, Mohsen Patro, Rob PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index |
title | PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502150/ https://www.ncbi.nlm.nih.gov/pubmed/34117875 http://dx.doi.org/10.1093/bioinformatics/btab408 |
work_keys_str_mv | AT almodaresifatemeh puffalignerafastefficientandaccuratealignerbasedonthepufferfishindex AT zakerimohsen puffalignerafastefficientandaccuratealignerbasedonthepufferfishindex AT patrorob puffalignerafastefficientandaccuratealignerbasedonthepufferfishindex |