
StrainIQ: A Novel n-Gram-Based Method for Taxonomic Profiling of Human Microbiota at the Strain Level

The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools to deeply analyze metagenomics datasets. Identifying strain-level variations in microbial communities is important to understanding the ons...


Bibliographic Details
Main authors: Pandey, Sanjit, Avuthu, Nagavardhini, Guda, Chittibabu
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10454763/
https://www.ncbi.nlm.nih.gov/pubmed/37628698
http://dx.doi.org/10.3390/genes14081647
_version_ 1785096277424340992
author Pandey, Sanjit
Avuthu, Nagavardhini
Guda, Chittibabu
collection PubMed
description The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools to deeply analyze metagenomics datasets. Identifying strain-level variations in microbial communities is important to understanding the onset and progression of diseases, host–pathogen interrelationships, and drug resistance, in addition to designing new therapeutic regimens. In this study, we developed a novel tool called StrainIQ (strain identification and quantification) based on a new n-gram-based (series of n number of adjacent nucleotides in the DNA sequence) algorithm for predicting and quantifying strain-level taxa from whole-genome metagenomic sequencing data. We thoroughly evaluated our method using simulated and mock metagenomic datasets and compared its performance with existing methods. On average, it showed 85.8% sensitivity and 78.2% specificity on simulated datasets. It also showed higher specificity and sensitivity using n-gram models built from reduced reference genomes and on models with lower coverage sequencing data. It outperforms alternative approaches in genus- and strain-level prediction and strain abundance estimation. Overall, the results show that StrainIQ achieves high accuracy by implementing customized model-building and is an efficient tool for site-specific microbial community profiling.
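The description defines an n-gram as a series of n adjacent nucleotides in a DNA sequence. As a minimal illustration of that definition only, the sketch below counts overlapping n-grams in a short string. The function name `ngram_counts` and the example sequence are hypothetical, and this is not StrainIQ's model-building or scoring procedure, which the record does not detail.

```python
from collections import Counter


def ngram_counts(sequence: str, n: int) -> Counter:
    """Count every overlapping window of n adjacent nucleotides.

    Hypothetical helper for illustration; not part of StrainIQ.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    seq = sequence.upper()
    # Slide a window of width n one base at a time across the sequence.
    # If the sequence is shorter than n, the range is empty and so is the result.
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


# Toy sequence (an assumption, not data from the study).
counts = ngram_counts("ACGTACGT", 3)
for ngram, count in sorted(counts.items()):
    print(ngram, count)
```

A sequence of length L yields L - n + 1 overlapping n-grams, so the choice of n trades off how specific each n-gram is against how many distinct n-grams must be stored.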
format Online
Article
Text
id pubmed-10454763
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10454763 2023-08-26 StrainIQ: A Novel n-Gram-Based Method for Taxonomic Profiling of Human Microbiota at the Strain Level Pandey, Sanjit Avuthu, Nagavardhini Guda, Chittibabu Genes (Basel) Article The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools to deeply analyze metagenomics datasets. Identifying strain-level variations in microbial communities is important to understanding the onset and progression of diseases, host–pathogen interrelationships, and drug resistance, in addition to designing new therapeutic regimens. In this study, we developed a novel tool called StrainIQ (strain identification and quantification) based on a new n-gram-based (series of n number of adjacent nucleotides in the DNA sequence) algorithm for predicting and quantifying strain-level taxa from whole-genome metagenomic sequencing data. We thoroughly evaluated our method using simulated and mock metagenomic datasets and compared its performance with existing methods. On average, it showed 85.8% sensitivity and 78.2% specificity on simulated datasets. It also showed higher specificity and sensitivity using n-gram models built from reduced reference genomes and on models with lower coverage sequencing data. It outperforms alternative approaches in genus- and strain-level prediction and strain abundance estimation. Overall, the results show that StrainIQ achieves high accuracy by implementing customized model-building and is an efficient tool for site-specific microbial community profiling. MDPI 2023-08-18 /pmc/articles/PMC10454763/ /pubmed/37628698 http://dx.doi.org/10.3390/genes14081647 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title StrainIQ: A Novel n-Gram-Based Method for Taxonomic Profiling of Human Microbiota at the Strain Level
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10454763/
https://www.ncbi.nlm.nih.gov/pubmed/37628698
http://dx.doi.org/10.3390/genes14081647