
Representing bacteria with unique genomic signatures

Classifying or identifying bacteria in metagenomic samples is an important problem in the analysis of metagenomic data. This task can be computationally expensive since microbial communities usually consist of hundreds to thousands of environmental microbial species. We proposed a new method for rep...


Bibliographic Details
Main authors: Pham, Diem-Trang; Phan, Vinhthuy
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709128/
https://www.ncbi.nlm.nih.gov/pubmed/36466712
http://dx.doi.org/10.3389/fdata.2022.1018356
author Pham, Diem-Trang
Phan, Vinhthuy
collection PubMed
description Classifying or identifying bacteria in metagenomic samples is an important problem in the analysis of metagenomic data. This task can be computationally expensive since microbial communities usually consist of hundreds to thousands of environmental microbial species. We proposed a new method for representing bacteria in a microbial community using genomic signatures of those bacteria. With respect to the microbial community, the genomic signatures of each bacterium are unique to that bacterium; they do not exist in other bacteria in the community. Further, since the genomic signatures of a bacterium are much smaller than its genome size, the approach allows for a compressed representation of the microbial community. This approach uses a modified Bloom filter to store short k-mers with hash values that are unique to each bacterium. We show that most bacteria in many microbiomes can be represented uniquely using the proposed genomic signatures. This approach paves the way toward new methods for classifying bacteria in metagenomic samples.
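The idea of keeping only k-mers whose hash values belong to a single bacterium can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not describe how the Bloom filter is modified, so the sketch swaps in a plain dictionary from hash slot to the set of genomes that hash into it. The names `kmers`, `slot`, `unique_signatures`, the parameter `num_slots`, the choice of SHA-256, and the toy genomes are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def kmers(sequence, k):
    """Yield every length-k substring of the sequence, in order."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def slot(kmer, num_slots):
    """Map a k-mer to a slot index using a stable hash (illustrative choice)."""
    digest = hashlib.sha256(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


def unique_signatures(genomes, k, num_slots):
    """Return {name: k-mers whose hash slot is used by no other genome}.

    A dictionary of slot owners stands in for the modified Bloom filter
    mentioned in the abstract (an assumption, not the paper's structure).
    """
    owners = defaultdict(set)  # slot index -> names of genomes hashing into it
    per_genome = {}
    for name, seq in genomes.items():
        per_genome[name] = set(kmers(seq, k))
        for km in per_genome[name]:
            owners[slot(km, num_slots)].add(name)
    # Keep a k-mer only if its slot is owned by exactly this genome.
    return {
        name: {km for km in kms if owners[slot(km, num_slots)] == {name}}
        for name, kms in per_genome.items()
    }


if __name__ == "__main__":
    # Toy sequences, not real bacterial genomes.
    toy = {"A": "ACGTACGT", "B": "TTTTACGT"}
    for name, sig in unique_signatures(toy, k=4, num_slots=2**20).items():
        print(name, sorted(sig))
```

In this sketch, a k-mer shared by two genomes always lands in a slot with two owners and is dropped from both signatures. A hash collision between different k-mers from different genomes also removes them, so a smaller `num_slots` trades memory for fewer surviving signature k-mers.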
format Online
Article
Text
id pubmed-9709128
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9709128 2022-12-01 Representing bacteria with unique genomic signatures Pham, Diem-Trang Phan, Vinhthuy Front Big Data Big Data Frontiers Media S.A. 2022-11-16 /pmc/articles/PMC9709128/ /pubmed/36466712 http://dx.doi.org/10.3389/fdata.2022.1018356 Text en Copyright © 2022 Pham and Phan. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Representing bacteria with unique genomic signatures
topic Big Data