Magic-BLAST, an accurate RNA-seq aligner for long and short reads
BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas f...
Main Authors: | Boratyn, Grzegorz M., Thierry-Mieg, Jean, Thierry-Mieg, Danielle, Busby, Ben, Madden, Thomas L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6659269/ https://www.ncbi.nlm.nih.gov/pubmed/31345161 http://dx.doi.org/10.1186/s12859-019-2996-x |
---|---|
author | Boratyn, Grzegorz M. Thierry-Mieg, Jean Thierry-Mieg, Danielle Busby, Ben Madden, Thomas L. |
collection | PubMed |
description | BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2996-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6659269 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6659269 2019-08-01 Magic-BLAST, an accurate RNA-seq aligner for long and short reads Boratyn, Grzegorz M. Thierry-Mieg, Jean Thierry-Mieg, Danielle Busby, Ben Madden, Thomas L. BMC Bioinformatics Software BioMed Central 2019-07-25 /pmc/articles/PMC6659269/ /pubmed/31345161 http://dx.doi.org/10.1186/s12859-019-2996-x Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Magic-BLAST, an accurate RNA-seq aligner for long and short reads |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6659269/ https://www.ncbi.nlm.nih.gov/pubmed/31345161 http://dx.doi.org/10.1186/s12859-019-2996-x |