
Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis

BACKGROUND: The development of sequencing technologies to evaluate bacterial microbiota composition has allowed new insights into the importance of microbial ecology. However, the variety of methodologies used among amplicon sequencing workflows leads to uncertainty about best practices as well as r...


Bibliographic Details
Main authors: De Wolfe, Travis J., Wright, Erik S.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10114302/
https://www.ncbi.nlm.nih.gov/pubmed/37076812
http://dx.doi.org/10.1186/s12866-023-02851-8
_version_ 1785027987061604352
author De Wolfe, Travis J.
Wright, Erik S.
author_facet De Wolfe, Travis J.
Wright, Erik S.
author_sort De Wolfe, Travis J.
collection PubMed
description BACKGROUND: The development of sequencing technologies to evaluate bacterial microbiota composition has allowed new insights into the importance of microbial ecology. However, the variety of methodologies used among amplicon sequencing workflows leads to uncertainty about best practices as well as reproducibility and replicability among microbiome studies. Using a bacterial mock community composed of 37 soil isolates, we performed a comprehensive methodological evaluation of workflows, each with a different combination of methodological factors spanning sample preparation to bioinformatic analysis to define sources of artifacts that affect coverage, accuracy, and biases in the resulting compositional profiles. RESULTS: Of the workflows examined, those using the V4-V4 primer set enabled the highest level of concordance between the original mock community and resulting microbiome sequence composition. Use of a high-fidelity polymerase, or a lower-fidelity polymerase with an increased PCR elongation time, limited chimera formation. Bioinformatic pipelines presented a trade-off between the fraction of distinct community members identified (coverage) and fraction of correct sequences (accuracy). DADA2 and QIIME2 assembled V4-V4 reads amplified by Taq polymerase resulted in the highest accuracy (100%) but had a coverage of only 52%. Using mothur to assemble and denoise V4-V4 reads resulted in a coverage of 75%, albeit with marginally lower accuracy (99.5%). CONCLUSIONS: Optimization of microbiome workflows is critical for accuracy and to support reproducibility and replicability among microbiome studies. These considerations will help reveal the guiding principles of microbial ecology and impact the translation of microbiome research to human and environmental health. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-023-02851-8.
format Online
Article
Text
id pubmed-10114302
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-101143022023-04-20 Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis De Wolfe, Travis J. Wright, Erik S. BMC Microbiol Research BACKGROUND: The development of sequencing technologies to evaluate bacterial microbiota composition has allowed new insights into the importance of microbial ecology. However, the variety of methodologies used among amplicon sequencing workflows leads to uncertainty about best practices as well as reproducibility and replicability among microbiome studies. Using a bacterial mock community composed of 37 soil isolates, we performed a comprehensive methodological evaluation of workflows, each with a different combination of methodological factors spanning sample preparation to bioinformatic analysis to define sources of artifacts that affect coverage, accuracy, and biases in the resulting compositional profiles. RESULTS: Of the workflows examined, those using the V4-V4 primer set enabled the highest level of concordance between the original mock community and resulting microbiome sequence composition. Use of a high-fidelity polymerase, or a lower-fidelity polymerase with an increased PCR elongation time, limited chimera formation. Bioinformatic pipelines presented a trade-off between the fraction of distinct community members identified (coverage) and fraction of correct sequences (accuracy). DADA2 and QIIME2 assembled V4-V4 reads amplified by Taq polymerase resulted in the highest accuracy (100%) but had a coverage of only 52%. Using mothur to assemble and denoise V4-V4 reads resulted in a coverage of 75%, albeit with marginally lower accuracy (99.5%). CONCLUSIONS: Optimization of microbiome workflows is critical for accuracy and to support reproducibility and replicability among microbiome studies. These considerations will help reveal the guiding principles of microbial ecology and impact the translation of microbiome research to human and environmental health. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-023-02851-8. BioMed Central 2023-04-19 /pmc/articles/PMC10114302/ /pubmed/37076812 http://dx.doi.org/10.1186/s12866-023-02851-8 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
De Wolfe, Travis J.
Wright, Erik S.
Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title_full Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title_fullStr Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title_full_unstemmed Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title_short Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
title_sort multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10114302/
https://www.ncbi.nlm.nih.gov/pubmed/37076812
http://dx.doi.org/10.1186/s12866-023-02851-8
work_keys_str_mv AT dewolfetravisj multifactorialexaminationofampliconsequencingworkflowsfromsamplepreparationtobioinformaticanalysis
AT wrighteriks multifactorialexaminationofampliconsequencingworkflowsfromsamplepreparationtobioinformaticanalysis