Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics

Metaproteomics has been increasingly utilized for high-throughput characterization of proteins in complex environments and has been demonstrated to provide insights into microbial composition and functional roles. However, significant challenges remain in metaproteomic data analysi...

Bibliographic Details
Main authors: Lee, Joon-Yong; Mitchell, Hugh D.; Burnet, Meagan C.; Wu, Ruonan; Jenson, Sarah C.; Merkley, Eric D.; Nakayasu, Ernesto S.; Nicora, Carrie D.; Jansson, Janet K.; Burnum-Johnson, Kristin E.; Payne, Samuel H.
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361346/
https://www.ncbi.nlm.nih.gov/pubmed/35793793
http://dx.doi.org/10.1021/acs.jproteome.2c00334
_version_ 1784764513151614976
author Lee, Joon-Yong
Mitchell, Hugh D.
Burnet, Meagan C.
Wu, Ruonan
Jenson, Sarah C.
Merkley, Eric D.
Nakayasu, Ernesto S.
Nicora, Carrie D.
Jansson, Janet K.
Burnum-Johnson, Kristin E.
Payne, Samuel H.
collection PubMed
description Metaproteomics has been increasingly utilized for high-throughput characterization of proteins in complex environments and has been demonstrated to provide insights into microbial composition and functional roles. However, significant challenges remain in metaproteomic data analysis, including creation of a sample-specific protein sequence database. A well-matched database is a requirement for successful metaproteomics analysis, and the accuracy and sensitivity of PSM identification algorithms suffer when the database is incomplete or contains extraneous sequences. When matched DNA sequencing data of the sample is unavailable or incomplete, creating the proteome database that accurately represents the organisms in the sample is a challenge. Here, we leverage a de novo peptide sequencing approach to identify the sample composition directly from metaproteomic data. First, we created a deep learning model, Kaiko, to predict the peptide sequences from mass spectrometry data and trained it on 5 million peptide–spectrum matches from 55 phylogenetically diverse bacteria. After training, Kaiko successfully identified organisms from soil isolates and synthetic communities directly from proteomics data. Finally, we created a pipeline for metaproteome database generation using Kaiko. We tested the pipeline on native soils collected in Kansas, showing that the de novo sequencing model can be employed as an alternative and complementary method to construct the sample-specific protein database instead of relying on (un)matched metagenomes. Our pipeline identified all highly abundant taxa from 16S rRNA sequencing of the soil samples and uncovered several additional species which were strongly represented only in proteomic data.
format Online
Article
Text
id pubmed-9361346
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9361346 2022-08-10 J Proteome Res American Chemical Society 2022-07-06 2022-08-05 /pmc/articles/PMC9361346/ /pubmed/35793793 http://dx.doi.org/10.1021/acs.jproteome.2c00334 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics