Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data
BACKGROUND: A few recent large efforts have significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome assembled genomes (MAGs). These genomes provide a useful resource for studying the func...
Main authors: | Stamboulian, Moses; Li, Sujun; Ye, Yuzhen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8017886/ https://www.ncbi.nlm.nih.gov/pubmed/33795009 http://dx.doi.org/10.1186/s40168-021-01035-8 |
_version_ | 1783674136822808576 |
---|---|
author | Stamboulian, Moses Li, Sujun Ye, Yuzhen |
author_sort | Stamboulian, Moses |
collection | PubMed |
description | BACKGROUND: A few recent large efforts have significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome assembled genomes (MAGs). These genomes provide a useful resource for studying the functionality of the human-associated microbiome and its relationship with human health and diseases. One application of these genomes is to provide a universal reference for database search in metaproteomic studies when matched metagenomic/metatranscriptomic data are unavailable. However, a larger collection of reference genomes may not necessarily result in better peptide/protein identification, because the increase in search space often leads to fewer spectrum-peptide matches, not to mention the drastic increase in computation time. METHODS: Here, we present a new approach that uses two steps to optimize the use of the reference genomes and MAGs as the universal reference for human gut metaproteomic MS/MS data analysis. The first step is to use only the high-abundance proteins (HAPs) (i.e., ribosomal proteins and elongation factors) for metaproteomic MS/MS database search and, based on the identification results, to derive the taxonomic composition of the underlying microbial community. The second step is to expand the search database by including all proteins from the identified abundant species. We call our approach HAPiID (HAPs guided metaproteomics IDentification). RESULTS: We tested our approach using human gut metaproteomic datasets from a previous study and compared it to MetaPro-IQ, the state-of-the-art reference database search method for metaproteomic identification in studies of the human gut microbiota. Our results show that our two-step method was not only significantly faster but also identified more peptides. We further demonstrated the application of HAPiID in revealing protein profiles of individual human-associated bacterial species, one or a few species at a time, using metaproteomic data. CONCLUSIONS: The HAP-guided profiling approach presents a novel and effective way to construct a target database for metaproteomic data analysis. The HAPiID pipeline built upon this approach provides a universal tool for analyzing human gut-associated metaproteomic data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01035-8). |
format | Online Article Text |
id | pubmed-8017886 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8017886 2021-04-05 Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data Stamboulian, Moses Li, Sujun Ye, Yuzhen Microbiome Research BioMed Central 2021-04-01 /pmc/articles/PMC8017886/ /pubmed/33795009 http://dx.doi.org/10.1186/s40168-021-01035-8 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Stamboulian, Moses Li, Sujun Ye, Yuzhen Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data |
title | Using high-abundance proteins as guides for fast and effective peptide/protein identification from human gut metaproteomic data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8017886/ https://www.ncbi.nlm.nih.gov/pubmed/33795009 http://dx.doi.org/10.1186/s40168-021-01035-8 |
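
The two-step database construction summarized in the METHODS part of the description can be illustrated with a minimal Python sketch. This is not the HAPiID pipeline itself: the `Protein` record, the function names, the substring-based peptide mapping, the `min_peptides` cutoff, and the toy sequences are all assumptions made for illustration. The actual MS/MS database search is not shown; its output is represented by a plain list of identified peptides.

```python
from collections import Counter
from typing import NamedTuple


class Protein(NamedTuple):
    """Hypothetical record for one protein in the reference collection."""
    protein_id: str
    genome_id: str
    is_hap: bool  # True for ribosomal proteins and elongation factors
    sequence: str


def hap_database(proteins):
    """Step 1 database: keep only the high-abundance proteins (HAPs)."""
    return [p for p in proteins if p.is_hap]


def abundant_genomes(hap_db, identified_peptides, min_peptides=2):
    """Select genomes supported by at least `min_peptides` identified peptides.

    A peptide found in several genomes counts once for each of them.
    The cutoff and the substring mapping are illustrative assumptions.
    """
    counts = Counter()
    for peptide in identified_peptides:
        genomes = {p.genome_id for p in hap_db if peptide in p.sequence}
        counts.update(genomes)
    return {g for g, n in counts.items() if n >= min_peptides}


def expanded_database(proteins, genomes):
    """Step 2 database: all proteins from the selected genomes."""
    return [p for p in proteins if p.genome_id in genomes]


# Toy reference collection with two genomes (sequences are made up).
proteins = [
    Protein("g1_rpsA", "g1", True, "MKTAYIAKQRLSEELK"),
    Protein("g1_ef", "g1", True, "GITINTSHVEYDTPTR"),
    Protein("g1_enzyme", "g1", False, "MSDNELVAGGK"),
    Protein("g2_rpsA", "g2", True, "MKTPYLAKQRWSEELK"),
    Protein("g2_enzyme", "g2", False, "MTTQAPLK"),
]

# Stand-in for peptides identified by searching spectra against the HAP database.
identified = ["AYIAK", "HVEYDTPTR", "PYLAK"]

hap_db = hap_database(proteins)
selected = abundant_genomes(hap_db, identified, min_peptides=2)
expanded = expanded_database(proteins, selected)
```

In this toy run, only genome `g1` is supported by two HAP peptides, so the second-step database contains all three `g1` proteins, including the non-HAP enzyme that the first search could not have matched.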