Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data

BACKGROUND: A few recent large efforts have significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome assembled genomes (MAGs). These genomes provide a useful resource for studying the func...

Bibliographic Details
Main authors: Stamboulian, Moses, Li, Sujun, Ye, Yuzhen
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8017886/
https://www.ncbi.nlm.nih.gov/pubmed/33795009
http://dx.doi.org/10.1186/s40168-021-01035-8
_version_ 1783674136822808576
author Stamboulian, Moses
Li, Sujun
Ye, Yuzhen
author_facet Stamboulian, Moses
Li, Sujun
Ye, Yuzhen
author_sort Stamboulian, Moses
collection PubMed
description BACKGROUND: A few recent large efforts have significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome assembled genomes (MAGs). These genomes provide a useful resource for studying the functionality of the human-associated microbiome and its relationship with human health and diseases. One application of these genomes is to provide a universal reference for database search in metaproteomic studies when matched metagenomic/metatranscriptomic data are unavailable. However, a larger collection of reference genomes may not necessarily result in better peptide/protein identification, because the increase in search space often leads to fewer spectrum-peptide matches, not to mention the drastic increase in computation time. METHODS: Here, we present a new approach that uses two steps to optimize the use of the reference genomes and MAGs as the universal reference for human gut metaproteomic MS/MS data analysis. The first step is to use only the high-abundance proteins (HAPs) (i.e., ribosomal proteins and elongation factors) for metaproteomic MS/MS database search and, based on the identification results, to derive the taxonomic composition of the underlying microbial community. The second step is to expand the search database by including all proteins from the identified abundant species. We call our approach HAPiID (HAPs guided metaproteomics IDentification). RESULTS: We tested our approach using human gut metaproteomic datasets from a previous study and compared it to the state-of-the-art reference database search method MetaPro-IQ for metaproteomic identification in studying human gut microbiota. Our results show that our two-step method not only ran significantly faster but also identified more peptides. We further demonstrated the application of HAPiID to revealing protein profiles of individual human-associated bacterial species, one or a few species at a time, using metaproteomic data. CONCLUSIONS: The HAP-guided profiling approach presents a novel, effective way to construct target databases for metaproteomic data analysis. The HAPiID pipeline built upon this approach provides a universal tool for analyzing human gut-associated metaproteomic data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01035-8).
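The two-step idea described in METHODS can be illustrated with a toy sketch. This is not the HAPiID implementation: the real pipeline searches MS/MS spectra with a database search engine, whereas this sketch assumes the observed peptides are already known strings and substitutes exact string matching for the search. The function names, the `min_hits` threshold, the simplified digestion rule, and the sequences are all hypothetical.

```python
from collections import Counter

def digest(protein):
    """Toy tryptic digest: cut after every K or R (no proline rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

def build_index(proteins):
    """Map each peptide to the set of genome ids whose proteins contain it."""
    index = {}
    for genome_id, seq in proteins:
        for pep in digest(seq):
            index.setdefault(pep, set()).add(genome_id)
    return index

def select_genomes(observed_peptides, hap_proteins, min_hits=2):
    """Step 1 (assumed form): match against HAPs only, keep genomes with enough hits."""
    index = build_index(hap_proteins)
    hits = Counter()
    for pep in observed_peptides:
        for genome_id in index.get(pep, ()):
            hits[genome_id] += 1
    return sorted(g for g, n in hits.items() if n >= min_hits)

def expand_database(selected, all_proteins):
    """Step 2 (assumed form): keep every protein from the selected genomes."""
    keep = set(selected)
    return [(g, seq) for g, seq in all_proteins if g in keep]

# Hypothetical data: one HAP and one other protein per genome.
hap_proteins = [
    ("genomeA", "MKAAGRLLK"),
    ("genomeB", "MKTTGRVVK"),
    ("genomeC", "MKCCGRWWK"),
]
all_proteins = hap_proteins + [
    ("genomeA", "PEPTIDEK"),
    ("genomeB", "SAMPLER"),
    ("genomeC", "QWERTYK"),
]
observed = ["AAGR", "LLK", "TTGR"]

selected = select_genomes(observed, hap_proteins)
expanded = expand_database(selected, all_proteins)
print(selected)
print(expanded)
```

In this toy run only genomeA reaches the threshold, so the second-step database holds two proteins instead of six. That reduction in search space is the effect the abstract attributes to the approach.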
format Online
Article
Text
id pubmed-8017886
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-80178862021-04-05 Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data Stamboulian, Moses Li, Sujun Ye, Yuzhen Microbiome Research BACKGROUND: A few recent large efforts significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome assembled genomes (MAGs). These genomes provide useful resource for studying the functionality of the human-associated microbiome and their relationship with human health and diseases. One application of these genomes is to provide a universal reference for database search in metaproteomic studies, when matched metagenomic/metatranscriptomic data are unavailable. However, a greater collection of reference genomes may not necessarily result in better peptide/protein identification because the increase of search space often leads to fewer spectrum-peptide matches, not to mention the drastic increase of computation time. METHODS: Here, we present a new approach that uses two steps to optimize the use of the reference genomes and MAGs as the universal reference for human gut metaproteomic MS/MS data analysis. The first step is to use only the high-abundance proteins (HAPs) (i.e., ribosomal proteins and elongation factors) for metaproteomic MS/MS database search and, based on the identification results, to derive the taxonomic composition of the underlying microbial community. The second step is to expand the search database by including all proteins from identified abundant species. We call our approach HAPiID (HAPs guided metaproteomics IDentification). RESULTS: We tested our approach using human gut metaproteomic datasets from a previous study and compared it to the state-of-the-art reference database search method MetaPro-IQ for metaproteomic identification in studying human gut microbiota. 
Our results show that our two-steps method not only performed significantly faster but also was able to identify more peptides. We further demonstrated the application of HAPiID to revealing protein profiles of individual human-associated bacterial species, one or a few species at a time, using metaproteomic data. CONCLUSIONS: The HAP guided profiling approach presents a novel effective way for constructing target database for metaproteomic data analysis. The HAPiID pipeline built upon this approach provides a universal tool for analyzing human gut-associated metaproteomic data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01035-8). BioMed Central 2021-04-01 /pmc/articles/PMC8017886/ /pubmed/33795009 http://dx.doi.org/10.1186/s40168-021-01035-8 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Stamboulian, Moses
Li, Sujun
Ye, Yuzhen
Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title_full Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title_fullStr Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title_full_unstemmed Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title_short Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
title_sort using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8017886/
https://www.ncbi.nlm.nih.gov/pubmed/33795009
http://dx.doi.org/10.1186/s40168-021-01035-8
work_keys_str_mv AT stamboulianmoses usinghighabundanceproteinsasguidesforfastandeffectivepeptideproteinidentificationfromhumangutmetaproteomicdata
AT lisujun usinghighabundanceproteinsasguidesforfastandeffectivepeptideproteinidentificationfromhumangutmetaproteomicdata
AT yeyuzhen usinghighabundanceproteinsasguidesforfastandeffectivepeptideproteinidentificationfromhumangutmetaproteomicdata