The Impact of DNA Polymerase and Number of Rounds of Amplification in PCR on 16S rRNA Gene Sequence Data

Bibliographic Details
Main Authors: Sze, Marc A., Schloss, Patrick D.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6531881/
https://www.ncbi.nlm.nih.gov/pubmed/31118299
http://dx.doi.org/10.1128/mSphere.00163-19
author Sze, Marc A.
Schloss, Patrick D.
collection PubMed
description PCR amplification of 16S rRNA genes is a critical yet underappreciated step in the generation of sequence data to describe the taxonomic composition of microbial communities. Numerous factors in the design of PCR can impact the sequencing error rate, the abundance of chimeric sequences, and the degree to which the fragments in the product represent their abundance in the original sample (i.e., bias). We compared the performance of high fidelity polymerases and various numbers of rounds of amplification when amplifying a mock community and human stool samples. Although it was impossible to derive specific recommendations, we did observe general trends. Namely, using a polymerase with the highest possible fidelity and minimizing the number of rounds of PCR reduced the sequencing error rate, fraction of chimeric sequences, and bias. Evidence of bias at the sequence level was subtle and could not be ascribed to the fragments’ fraction of bases that were guanines or cytosines. When analyzing mock community data, the amount that the community deviated from the expected composition increased with the number of rounds of PCR. This bias was inconsistent for human stool samples. Overall, the results underscore the difficulty of comparing sequence data that are generated by different PCR protocols. However, the results indicate that the variation in human stool samples is generally larger than that introduced by the choice of polymerase or number of rounds of PCR.
IMPORTANCE A steep decline in sequencing costs drove an explosion in studies characterizing microbial communities from diverse environments. Although a significant amount of effort has gone into understanding the error profiles of DNA sequencers, little has been done to understand the downstream effects of the PCR amplification protocol. We quantified the effects of the choice of polymerase and number of PCR cycles on the quality of downstream data. We found that these choices can have a profound impact on the way that a microbial community is represented in the sequence data. The effects are relatively small compared to the variation in human stool samples; however, care should be taken to use polymerases with the highest possible fidelity and to minimize the number of rounds of PCR. These results also underscore that it is not possible to directly compare sequence data generated under different PCR conditions.
format Online
Article
Text
id pubmed-6531881
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-6531881 2019-05-28 The Impact of DNA Polymerase and Number of Rounds of Amplification in PCR on 16S rRNA Gene Sequence Data Sze, Marc A. Schloss, Patrick D. mSphere Research Article American Society for Microbiology 2019-05-22 /pmc/articles/PMC6531881/ /pubmed/31118299 http://dx.doi.org/10.1128/mSphere.00163-19 Text en Copyright © 2019 Sze and Schloss. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title The Impact of DNA Polymerase and Number of Rounds of Amplification in PCR on 16S rRNA Gene Sequence Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6531881/
https://www.ncbi.nlm.nih.gov/pubmed/31118299
http://dx.doi.org/10.1128/mSphere.00163-19