Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform

Bibliographic Details
Main Authors: Schirmer, Melanie; Ijaz, Umer Z.; D'Amore, Rosalinda; Hall, Neil; Sloan, William T.; Quince, Christopher
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects: Methods Online
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4381044/
https://www.ncbi.nlm.nih.gov/pubmed/25586220
http://dx.doi.org/10.1093/nar/gku1341
author Schirmer, Melanie
Ijaz, Umer Z.
D'Amore, Rosalinda
Hall, Neil
Sloan, William T.
Quince, Christopher
collection PubMed
description With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%.
format Online
Article
Text
id pubmed-4381044
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4381044 2015-04-03 Nucleic Acids Res Methods Online Oxford University Press 2015-03-31 2015-01-13 /pmc/articles/PMC4381044/ /pubmed/25586220 http://dx.doi.org/10.1093/nar/gku1341 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform
topic Methods Online