A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing
Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of...
Main Authors: | Kim, Hyeonwoo; Kim, Jiwon; Choi, Ji Won; Ahn, Kwang-Sung; Park, Dong-Il; Kim, Sangsoo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korea Genome Organization 2023 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584646/ https://www.ncbi.nlm.nih.gov/pubmed/37813636 http://dx.doi.org/10.5808/gi.23044 |
_version_ | 1785122783229902848 |
---|---|
author | Kim, Hyeonwoo Kim, Jiwon Choi, Ji Won Ahn, Kwang-Sung Park, Dong-Il Kim, Sangsoo |
author_facet | Kim, Hyeonwoo Kim, Jiwon Choi, Ji Won Ahn, Kwang-Sung Park, Dong-Il Kim, Sangsoo |
author_sort | Kim, Hyeonwoo |
collection | PubMed |
description | Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of sequencing reads during quality control, particularly in datasets with a large number of samples. We present a streamlined pipeline that integrates FastP for read trimming, HmmUFOtu for operational taxonomic unit (OTU) clustering, Vsearch for chimera checking, and Kraken2 for taxonomic assignment. To assess the pipeline’s performance, we reprocessed two published stool datasets of normal Korean populations: one with 890 and the other with 1,462 independent samples. In the first dataset, HmmUFOtu retained 93.2% of over 104 million read pairs after quality trimming, discarding chimeric or unclassifiable reads, while DADA2, a commonly used ASV method, retained only 44.6% of the reads. Nonetheless, both methods yielded qualitatively similar β-diversity plots. For the second dataset, HmmUFOtu retained 89.2% of read pairs, while DADA2 retained a mere 18.4% of the reads. HmmUFOtu, being a closed-reference clustering method, facilitates merging separately processed datasets, with shared OTUs between the two datasets exhibiting a correlation coefficient of 0.92 in total abundance (log scale). While the first two dimensions of the β-diversity plot exhibited a cohesive mixture of the two datasets, the third dimension revealed the presence of a batch effect. Our comparative evaluation of ASV and OTU methods within this streamlined pipeline provides valuable insights into their performance when processing large-scale microbial 16S rRNA amplicon sequencing data. The strengths of HmmUFOtu and its potential for dataset merging are highlighted. |
format | Online Article Text |
id | pubmed-10584646 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-105846462023-10-20 A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing Kim, Hyeonwoo Kim, Jiwon Choi, Ji Won Ahn, Kwang-Sung Park, Dong-Il Kim, Sangsoo Genomics Inform Original Article Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of sequencing reads during quality control, particularly in datasets with a large number of samples. We present a streamlined pipeline that integrates FastP for read trimming, HmmUFOtu for operational taxonomic unit (OTU) clustering, Vsearch for chimera checking, and Kraken2 for taxonomic assignment. To assess the pipeline’s performance, we reprocessed two published stool datasets of normal Korean populations: one with 890 and the other with 1,462 independent samples. In the first dataset, HmmUFOtu retained 93.2% of over 104 million read pairs after quality trimming, discarding chimeric or unclassifiable reads, while DADA2, a commonly used ASV method, retained only 44.6% of the reads. Nonetheless, both methods yielded qualitatively similar β-diversity plots. For the second dataset, HmmUFOtu retained 89.2% of read pairs, while DADA2 retained a mere 18.4% of the reads. HmmUFOtu, being a closed-reference clustering method, facilitates merging separately processed datasets, with shared OTUs between the two datasets exhibiting a correlation coefficient of 0.92 in total abundance (log scale). While the first two dimensions of the β-diversity plot exhibited a cohesive mixture of the two datasets, the third dimension revealed the presence of a batch effect. Our comparative evaluation of ASV and OTU methods within this streamlined pipeline provides valuable insights into their performance when processing large-scale microbial 16S rRNA amplicon sequencing data. The strengths of HmmUFOtu and its potential for dataset merging are highlighted. Korea Genome Organization 2023-07-31 /pmc/articles/PMC10584646/ /pubmed/37813636 http://dx.doi.org/10.5808/gi.23044 Text en (c) 2023, Korea Genome Organization https://creativecommons.org/licenses/by/4.0/ (CC) This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Kim, Hyeonwoo Kim, Jiwon Choi, Ji Won Ahn, Kwang-Sung Park, Dong-Il Kim, Sangsoo A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title | A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title_full | A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title_fullStr | A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title_full_unstemmed | A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title_short | A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing |
title_sort | streamlined pipeline based on hmmufotu for microbial community profiling using 16s rrna amplicon sequencing |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584646/ https://www.ncbi.nlm.nih.gov/pubmed/37813636 http://dx.doi.org/10.5808/gi.23044 |
work_keys_str_mv | AT kimhyeonwoo astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT kimjiwon astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT choijiwon astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT ahnkwangsung astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT parkdongil astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT kimsangsoo astreamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT kimhyeonwoo streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT kimjiwon streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT choijiwon streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT ahnkwangsung streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT parkdongil streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing AT kimsangsoo streamlinedpipelinebasedonhmmufotuformicrobialcommunityprofilingusing16srrnaampliconsequencing |