Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning

Small proteins, encoded by small open reading frames, are only beginning to emerge with the current advancement of omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulat...

Bibliographic Details
Main authors: Vajjala, Mitra; Johnson, Brady; Kasparek, Lauren; Leuze, Michael; Yao, Qiuming
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9354662/
https://www.ncbi.nlm.nih.gov/pubmed/35938008
http://dx.doi.org/10.3389/fgene.2022.935351
_version_ 1784763121488887808
author Vajjala, Mitra
Johnson, Brady
Kasparek, Lauren
Leuze, Michael
Yao, Qiuming
author_facet Vajjala, Mitra
Johnson, Brady
Kasparek, Lauren
Leuze, Michael
Yao, Qiuming
author_sort Vajjala, Mitra
collection PubMed
description Small proteins, encoded by small open reading frames, are only beginning to emerge with the current advancement of omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulating other protein activities, controlling cell cycles, and affecting disease physiology. In prokaryotes such as bacteria, the sequence space and functional groups of small proteins remain largely unexplored. Most bacterial species in a natural community cannot be easily isolated or cultured, so their peptides are better characterized through metagenomics. Bacterial peptides identified from metagenomic samples not only enrich the pool of small proteins but also reveal community-specific microbial ecology from a small-protein perspective. In this study, metaBP (Bacterial Peptides for metagenomic sample) has been developed as a comprehensive toolkit to explore the small protein universe from metagenomic samples. It takes raw sequencing reads as input, performs protein-level meta-assembly, and computes bacterial peptide homolog groups with sample-specific mutations. metaBP also integrates general protein annotation tools as well as our small protein-specific machine learning module, metaBP-ML, to construct a full landscape for bacterial peptides. metaBP-ML shows advantages for discovering functions of bacterial peptides in a microbial community and increases the yield of annotations by up to fivefold. The metaBP toolkit demonstrates its novelty in adopting protein-level assembly to discover small proteins, integrating a protein-clustering tool in the new and flexible RBiotools environment, and presenting a first small-protein landscape through metaBP-ML. Taken together, metaBP (and metaBP-ML) can profile functional bacterial peptides from metagenomic samples with potentially diverse mutations, in order to depict a unique landscape of small proteins from a microbial community.
format Online
Article
Text
id pubmed-9354662
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9354662 2022-08-06 Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning Vajjala, Mitra Johnson, Brady Kasparek, Lauren Leuze, Michael Yao, Qiuming Front Genet Genetics Small proteins, encoded by small open reading frames, are only beginning to emerge with the current advancement of omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulating other protein activities, controlling cell cycles, and affecting disease physiology. In prokaryotes such as bacteria, the sequence space and functional groups of small proteins remain largely unexplored. Most bacterial species in a natural community cannot be easily isolated or cultured, so their peptides are better characterized through metagenomics. Bacterial peptides identified from metagenomic samples not only enrich the pool of small proteins but also reveal community-specific microbial ecology from a small-protein perspective. In this study, metaBP (Bacterial Peptides for metagenomic sample) has been developed as a comprehensive toolkit to explore the small protein universe from metagenomic samples. It takes raw sequencing reads as input, performs protein-level meta-assembly, and computes bacterial peptide homolog groups with sample-specific mutations. metaBP also integrates general protein annotation tools as well as our small protein-specific machine learning module, metaBP-ML, to construct a full landscape for bacterial peptides. metaBP-ML shows advantages for discovering functions of bacterial peptides in a microbial community and increases the yield of annotations by up to fivefold. The metaBP toolkit demonstrates its novelty in adopting protein-level assembly to discover small proteins, integrating a protein-clustering tool in the new and flexible RBiotools environment, and presenting a first small-protein landscape through metaBP-ML. Taken together, metaBP (and metaBP-ML) can profile functional bacterial peptides from metagenomic samples with potentially diverse mutations, in order to depict a unique landscape of small proteins from a microbial community. Frontiers Media S.A. 2022-07-22 /pmc/articles/PMC9354662/ /pubmed/35938008 http://dx.doi.org/10.3389/fgene.2022.935351 Text en Copyright © 2022 Vajjala, Johnson, Kasparek, Leuze and Yao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Vajjala, Mitra
Johnson, Brady
Kasparek, Lauren
Leuze, Michael
Yao, Qiuming
Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title_full Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title_fullStr Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title_full_unstemmed Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title_short Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
title_sort profiling a community-specific function landscape for bacterial peptides through protein-level meta-assembly and machine learning
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9354662/
https://www.ncbi.nlm.nih.gov/pubmed/35938008
http://dx.doi.org/10.3389/fgene.2022.935351