Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics
Metaproteomics has been increasingly utilized for high-throughput characterization of proteins in complex environments and has been demonstrated to provide insights into microbial composition and functional roles. However, significant challenges remain in metaproteomic data analysi...
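Main authors: | Lee, Joon-Yong; Mitchell, Hugh D.; Burnet, Meagan C.; Wu, Ruonan; Jenson, Sarah C.; Merkley, Eric D.; Nakayasu, Ernesto S.; Nicora, Carrie D.; Jansson, Janet K.; Burnum-Johnson, Kristin E.; Payne, Samuel H. |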
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society, 2022 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361346/ https://www.ncbi.nlm.nih.gov/pubmed/35793793 http://dx.doi.org/10.1021/acs.jproteome.2c00334 |
_version_ | 1784764513151614976 |
---|---|
author | Lee, Joon-Yong Mitchell, Hugh D. Burnet, Meagan C. Wu, Ruonan Jenson, Sarah C. Merkley, Eric D. Nakayasu, Ernesto S. Nicora, Carrie D. Jansson, Janet K. Burnum-Johnson, Kristin E. Payne, Samuel H. |
author_facet | Lee, Joon-Yong Mitchell, Hugh D. Burnet, Meagan C. Wu, Ruonan Jenson, Sarah C. Merkley, Eric D. Nakayasu, Ernesto S. Nicora, Carrie D. Jansson, Janet K. Burnum-Johnson, Kristin E. Payne, Samuel H. |
author_sort | Lee, Joon-Yong |
collection | PubMed |
description | Metaproteomics has been increasingly utilized for high-throughput characterization of proteins in complex environments and has been demonstrated to provide insights into microbial composition and functional roles. However, significant challenges remain in metaproteomic data analysis, including creation of a sample-specific protein sequence database. A well-matched database is a requirement for successful metaproteomics analysis, and the accuracy and sensitivity of PSM identification algorithms suffer when the database is incomplete or contains extraneous sequences. When matched DNA sequencing data of the sample is unavailable or incomplete, creating the proteome database that accurately represents the organisms in the sample is a challenge. Here, we leverage a de novo peptide sequencing approach to identify the sample composition directly from metaproteomic data. First, we created a deep learning model, Kaiko, to predict the peptide sequences from mass spectrometry data and trained it on 5 million peptide–spectrum matches from 55 phylogenetically diverse bacteria. After training, Kaiko successfully identified organisms from soil isolates and synthetic communities directly from proteomics data. Finally, we created a pipeline for metaproteome database generation using Kaiko. We tested the pipeline on native soils collected in Kansas, showing that the de novo sequencing model can be employed as an alternative and complementary method to construct the sample-specific protein database instead of relying on (un)matched metagenomes. Our pipeline identified all highly abundant taxa from 16S rRNA sequencing of the soil samples and uncovered several additional species which were strongly represented only in proteomic data. |
format | Online Article Text |
id | pubmed-9361346 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9361346 2022-08-10 Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics Lee, Joon-Yong Mitchell, Hugh D. Burnet, Meagan C. Wu, Ruonan Jenson, Sarah C. Merkley, Eric D. Nakayasu, Ernesto S. Nicora, Carrie D. Jansson, Janet K. Burnum-Johnson, Kristin E. Payne, Samuel H. J Proteome Res Metaproteomics has been increasingly utilized for high-throughput characterization of proteins in complex environments and has been demonstrated to provide insights into microbial composition and functional roles. However, significant challenges remain in metaproteomic data analysis, including creation of a sample-specific protein sequence database. A well-matched database is a requirement for successful metaproteomics analysis, and the accuracy and sensitivity of PSM identification algorithms suffer when the database is incomplete or contains extraneous sequences. When matched DNA sequencing data of the sample is unavailable or incomplete, creating the proteome database that accurately represents the organisms in the sample is a challenge. Here, we leverage a de novo peptide sequencing approach to identify the sample composition directly from metaproteomic data. First, we created a deep learning model, Kaiko, to predict the peptide sequences from mass spectrometry data and trained it on 5 million peptide–spectrum matches from 55 phylogenetically diverse bacteria. After training, Kaiko successfully identified organisms from soil isolates and synthetic communities directly from proteomics data. Finally, we created a pipeline for metaproteome database generation using Kaiko. We tested the pipeline on native soils collected in Kansas, showing that the de novo sequencing model can be employed as an alternative and complementary method to construct the sample-specific protein database instead of relying on (un)matched metagenomes. Our pipeline identified all highly abundant taxa from 16S rRNA sequencing of the soil samples and uncovered several additional species which were strongly represented only in proteomic data. American Chemical Society 2022-07-06 2022-08-05 /pmc/articles/PMC9361346/ /pubmed/35793793 http://dx.doi.org/10.1021/acs.jproteome.2c00334 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Lee, Joon-Yong Mitchell, Hugh D. Burnet, Meagan C. Wu, Ruonan Jenson, Sarah C. Merkley, Eric D. Nakayasu, Ernesto S. Nicora, Carrie D. Jansson, Janet K. Burnum-Johnson, Kristin E. Payne, Samuel H. Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title | Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title_full | Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title_fullStr | Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title_full_unstemmed | Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title_short | Uncovering Hidden Members and Functions of the Soil Microbiome Using De Novo Metaproteomics |
title_sort | uncovering hidden members and functions of the soil microbiome using de novo metaproteomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361346/ https://www.ncbi.nlm.nih.gov/pubmed/35793793 http://dx.doi.org/10.1021/acs.jproteome.2c00334 |
work_keys_str_mv | AT leejoonyong uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT mitchellhughd uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT burnetmeaganc uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT wuruonan uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT jensonsarahc uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT merkleyericd uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT nakayasuernestos uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT nicoracarried uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT janssonjanetk uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT burnumjohnsonkristine uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics AT paynesamuelh uncoveringhiddenmembersandfunctionsofthesoilmicrobiomeusingdenovometaproteomics |