Strain level microbial detection and quantification with applications to single cell metagenomics
Computational identification and quantification of distinct microbes from high throughput sequencing data is crucial for our understanding of human health. Existing methods either use accurate but computationally expensive alignment-based approaches or less accurate but computationally fast alignmen...
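Main authors: | Zhu, Kaiyuan; Schäffer, Alejandro A.; Robinson, Welles; Xu, Junyan; Ruppin, Eytan; Ergun, A. Funda; Ye, Yuzhen; Sahinalp, S. Cenk |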
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9616933/ https://www.ncbi.nlm.nih.gov/pubmed/36307411 http://dx.doi.org/10.1038/s41467-022-33869-7 |
_version_ | 1784820747524374528 |
---|---|
author | Zhu, Kaiyuan Schäffer, Alejandro A. Robinson, Welles Xu, Junyan Ruppin, Eytan Ergun, A. Funda Ye, Yuzhen Sahinalp, S. Cenk |
author_facet | Zhu, Kaiyuan Schäffer, Alejandro A. Robinson, Welles Xu, Junyan Ruppin, Eytan Ergun, A. Funda Ye, Yuzhen Sahinalp, S. Cenk |
author_sort | Zhu, Kaiyuan |
collection | PubMed |
description | Computational identification and quantification of distinct microbes from high throughput sequencing data is crucial for our understanding of human health. Existing methods either use accurate but computationally expensive alignment-based approaches or less accurate but computationally fast alignment-free approaches, which often fail to correctly assign reads to genomes. Here we introduce CAMMiQ, a combinatorial optimization framework to identify and quantify distinct genomes (specified by a database) in a metagenomic dataset. As a key methodological innovation, CAMMiQ uses substrings of variable length and those that appear in two genomes in the database, as opposed to the commonly used fixed-length, unique substrings. These substrings allow to accurately decouple mixtures of highly similar genomes resulting in higher accuracy than the leading alternatives, without requiring additional computational resources, as demonstrated on commonly used benchmarking datasets. Importantly, we show that CAMMiQ can distinguish closely related bacterial strains in simulated metagenomic and real single-cell metatranscriptomic data. |
format | Online Article Text |
id | pubmed-9616933 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9616933 2022-10-30 Strain level microbial detection and quantification with applications to single cell metagenomics Zhu, Kaiyuan Schäffer, Alejandro A. Robinson, Welles Xu, Junyan Ruppin, Eytan Ergun, A. Funda Ye, Yuzhen Sahinalp, S. Cenk Nat Commun Article Computational identification and quantification of distinct microbes from high throughput sequencing data is crucial for our understanding of human health. Existing methods either use accurate but computationally expensive alignment-based approaches or less accurate but computationally fast alignment-free approaches, which often fail to correctly assign reads to genomes. Here we introduce CAMMiQ, a combinatorial optimization framework to identify and quantify distinct genomes (specified by a database) in a metagenomic dataset. As a key methodological innovation, CAMMiQ uses substrings of variable length and those that appear in two genomes in the database, as opposed to the commonly used fixed-length, unique substrings. These substrings allow to accurately decouple mixtures of highly similar genomes resulting in higher accuracy than the leading alternatives, without requiring additional computational resources, as demonstrated on commonly used benchmarking datasets. Importantly, we show that CAMMiQ can distinguish closely related bacterial strains in simulated metagenomic and real single-cell metatranscriptomic data. Nature Publishing Group UK 2022-10-28 /pmc/articles/PMC9616933/ /pubmed/36307411 http://dx.doi.org/10.1038/s41467-022-33869-7 Text en © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhu, Kaiyuan Schäffer, Alejandro A. Robinson, Welles Xu, Junyan Ruppin, Eytan Ergun, A. Funda Ye, Yuzhen Sahinalp, S. Cenk Strain level microbial detection and quantification with applications to single cell metagenomics |
title | Strain level microbial detection and quantification with applications to single cell metagenomics |
title_full | Strain level microbial detection and quantification with applications to single cell metagenomics |
title_fullStr | Strain level microbial detection and quantification with applications to single cell metagenomics |
title_full_unstemmed | Strain level microbial detection and quantification with applications to single cell metagenomics |
title_short | Strain level microbial detection and quantification with applications to single cell metagenomics |
title_sort | strain level microbial detection and quantification with applications to single cell metagenomics |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9616933/ https://www.ncbi.nlm.nih.gov/pubmed/36307411 http://dx.doi.org/10.1038/s41467-022-33869-7 |
work_keys_str_mv | AT zhukaiyuan strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT schafferalejandroa strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT robinsonwelles strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT xujunyan strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT ruppineytan strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT ergunafunda strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT yeyuzhen strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics AT sahinalpscenk strainlevelmicrobialdetectionandquantificationwithapplicationstosinglecellmetagenomics |