Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider

Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of comm...

Bibliographic Details
Main authors: Musich, Ryan; Cadle-Davidson, Lance; Osier, Michael V.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Plant Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8087178/
https://www.ncbi.nlm.nih.gov/pubmed/33936141
http://dx.doi.org/10.3389/fpls.2021.657240
_version_ 1783686625516060672
author Musich, Ryan
Cadle-Davidson, Lance
Osier, Michael V.
author_facet Musich, Ryan
Cadle-Davidson, Lance
Osier, Michael V.
author_sort Musich, Ryan
collection PubMed
description Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We will only in passing consider the effects of data cleanup, a precursor analysis to most alignment tools, and no consideration will be given to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], an RNA-seq dataset was used containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus Erysiphe necator. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp) for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. At the end, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available.
format Online
Article
Text
id pubmed-8087178
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8087178 2021-05-01 Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider Musich, Ryan Cadle-Davidson, Lance Osier, Michael V. Front Plant Sci Plant Science Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We will only in passing consider the effects of data cleanup, a precursor analysis to most alignment tools, and no consideration will be given to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], an RNA-seq dataset was used containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus Erysiphe necator. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp) for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. At the end, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available. Frontiers Media S.A. 2021-04-16 /pmc/articles/PMC8087178/ /pubmed/33936141 http://dx.doi.org/10.3389/fpls.2021.657240 Text en Copyright © 2021 Musich, Cadle-Davidson and Osier.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Musich, Ryan
Cadle-Davidson, Lance
Osier, Michael V.
Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title_full Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title_fullStr Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title_full_unstemmed Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title_short Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
title_sort comparison of short-read sequence aligners indicates strengths and weaknesses for biologists to consider
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8087178/
https://www.ncbi.nlm.nih.gov/pubmed/33936141
http://dx.doi.org/10.3389/fpls.2021.657240
work_keys_str_mv AT musichryan comparisonofshortreadsequencealignersindicatesstrengthsandweaknessesforbiologiststoconsider
AT cadledavidsonlance comparisonofshortreadsequencealignersindicatesstrengthsandweaknessesforbiologiststoconsider
AT osiermichaelv comparisonofshortreadsequencealignersindicatesstrengthsandweaknessesforbiologiststoconsider