
Fast and accurate short read alignment with Burrows–Wheeler transform

Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to...


Bibliographic Details
Main authors: Li, Heng; Durbin, Richard
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234/
https://www.ncbi.nlm.nih.gov/pubmed/19451168
http://dx.doi.org/10.1093/bioinformatics/btp324
author Li, Heng
Durbin, Richard
collection PubMed
description Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ∼10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: rd@sanger.ac.uk
format Text
id pubmed-2705234
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2705234 2009-07-06 Fast and accurate short read alignment with Burrows–Wheeler transform Li, Heng Durbin, Richard Bioinformatics Original Paper Oxford University Press 2009-07-15 2009-05-18 /pmc/articles/PMC2705234/ /pubmed/19451168 http://dx.doi.org/10.1093/bioinformatics/btp324 Text en http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fast and accurate short read alignment with Burrows–Wheeler transform
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234/
https://www.ncbi.nlm.nih.gov/pubmed/19451168
http://dx.doi.org/10.1093/bioinformatics/btp324
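
Illustrative note: the description above says BWA is based on backward search with the Burrows–Wheeler Transform. The following is a minimal Python sketch of that idea for exact matching only, assuming a naive suffix-array construction that is practical only for tiny inputs. The function names `bwt_index` and `backward_search` are hypothetical and are not taken from BWA; the record states that BWA also allows mismatches and gaps, which this sketch does not attempt.

```python
from collections import Counter


def bwt_index(text):
    """Build a toy BWT index for `text` (a '$' sentinel is appended)."""
    text += "$"
    # Naive suffix array: sort suffix start positions lexicographically.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Each BWT character is the one preceding the corresponding sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # C[c] = number of characters in the text strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return the sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    # Consume the pattern from its last character to its first.
    for c in reversed(pattern):
        if c not in C:
            return []
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = bwt_index("ACGTACGTAC")
print(backward_search("ACG", *index))  # prints [0, 4]
```

Each step narrows an interval of suffix-array rows using only the `C` and `occ` tables, so the search runs in a number of steps equal to the pattern length, independent of how long the reference text is.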