Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider
Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of comm...
Main Authors: | Musich, Ryan; Cadle-Davidson, Lance; Osier, Michael V. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Plant Science |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8087178/ https://www.ncbi.nlm.nih.gov/pubmed/33936141 http://dx.doi.org/10.3389/fpls.2021.657240 |
_version_ | 1783686625516060672 |
---|---|
author | Musich, Ryan Cadle-Davidson, Lance Osier, Michael V. |
author_facet | Musich, Ryan Cadle-Davidson, Lance Osier, Michael V. |
author_sort | Musich, Ryan |
collection | PubMed |
description | Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We will only in passing consider the effects of data cleanup, a precursor analysis to most alignment tools, and no consideration will be given to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], an RNA-seq dataset was used containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus Erysiphe necator. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp) for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. At the end, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available. |
format | Online Article Text |
id | pubmed-8087178 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8087178 2021-05-01 Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider Musich, Ryan Cadle-Davidson, Lance Osier, Michael V. Front Plant Sci Plant Science Frontiers Media S.A. 2021-04-16 /pmc/articles/PMC8087178/ /pubmed/33936141 http://dx.doi.org/10.3389/fpls.2021.657240 Text en Copyright © 2021 Musich, Cadle-Davidson and Osier. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8087178/ https://www.ncbi.nlm.nih.gov/pubmed/33936141 http://dx.doi.org/10.3389/fpls.2021.657240 |