
Sources of PCR-induced distortions in high-throughput sequencing data sets

PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four i...


Bibliographic Details
Main Authors:	Kebschull, Justus M., Zador, Anthony M.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2015
Subjects:	Methods Online
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4666380/ https://www.ncbi.nlm.nih.gov/pubmed/26187991 http://dx.doi.org/10.1093/nar/gkv717

author	Kebschull, Justus M. Zador, Anthony M.
collection	PubMed
description	PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules.
format	Online Article Text
id	pubmed-4666380
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-4666380 2015-12-02 Nucleic Acids Res Methods Online Oxford University Press 2015-12-02 2015-07-17 /pmc/articles/PMC4666380/ /pubmed/26187991 http://dx.doi.org/10.1093/nar/gkv717 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Sources of PCR-induced distortions in high-throughput sequencing data sets
topic	Methods Online
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4666380/ https://www.ncbi.nlm.nih.gov/pubmed/26187991 http://dx.doi.org/10.1093/nar/gkv717


Similar Items