Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags

Bibliographic Details
Main Authors: Chen, Xianfeng, Johnson, Stephen, Jeraldo, Patricio, Wang, Junwen, Chia, Nicholas, Kocher, Jean-Pierre A, Chen, Jun
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Technical Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5841375/
https://www.ncbi.nlm.nih.gov/pubmed/29267858
http://dx.doi.org/10.1093/gigascience/gix129
author Chen, Xianfeng
Johnson, Stephen
Jeraldo, Patricio
Wang, Junwen
Chia, Nicholas
Kocher, Jean-Pierre A
Chen, Jun
collection PubMed
description BACKGROUND: Illumina paired-end sequencing has become increasingly popular for 16S rRNA gene-based microbiota profiling. It provides higher phylogenetic resolution than single-end reads because of its longer read length. However, the reverse read (R2) often has significantly lower base quality, and a large proportion of R2s are discarded after quality control, resulting in a mixture of paired-end and single-end reads. A typical 16S analysis pipeline processes either paired-end or single-end reads, but not a mixture. Thus, quantification accuracy and statistical power are reduced by the loss of a large number of reads. As a result, rare taxa may not be detectable with the paired-end approach, while the single-end approach yields lower taxonomic resolution. RESULTS: To obtain both the higher phylogenetic resolution provided by paired-end reads and the higher sequence coverage provided by single-end reads, we propose a novel OTU-picking pipeline, hybrid-denovo, that can process a hybrid of single-end and paired-end reads. Using high-quality paired-end reads as a gold standard, we show that hybrid-denovo achieved the highest correlation with the gold standard and performed better than approaches based on paired-end or single-end reads alone in quantifying microbial diversity and taxonomic abundances. By applying our method to a rheumatoid arthritis (RA) data set, we demonstrated that hybrid-denovo captured more microbial diversity and identified more RA-associated taxa than a paired-end or single-end approach. CONCLUSIONS: Hybrid-denovo utilizes both paired-end and single-end 16S sequencing reads and is recommended for 16S rRNA gene-targeted paired-end sequencing data.
format Online
Article
Text
id pubmed-5841375
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5841375 2018-03-28 Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags Chen, Xianfeng Johnson, Stephen Jeraldo, Patricio Wang, Junwen Chia, Nicholas Kocher, Jean-Pierre A Chen, Jun Gigascience Technical Note
Oxford University Press 2017-12-15 /pmc/articles/PMC5841375/ /pubmed/29267858 http://dx.doi.org/10.1093/gigascience/gix129 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5841375/
https://www.ncbi.nlm.nih.gov/pubmed/29267858
http://dx.doi.org/10.1093/gigascience/gix129