Increasing the power of interpretation for soil metaproteomics data
BACKGROUND: Soil and sediment microorganisms are highly phylogenetically diverse but are currently largely under-represented in public molecular databases. Their functional characterization by means of metaproteomics is usually performed using metagenomic sequences acquired for the same sample. Howe...
Main Authors: | Jouffret, Virginie; Miotello, Guylaine; Culotta, Karen; Ayrault, Sophie; Pible, Olivier; Armengaud, Jean |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482631/ https://www.ncbi.nlm.nih.gov/pubmed/34587999 http://dx.doi.org/10.1186/s40168-021-01139-1 |
_version_ | 1784576948795604992 |
---|---|
author | Jouffret, Virginie Miotello, Guylaine Culotta, Karen Ayrault, Sophie Pible, Olivier Armengaud, Jean |
author_facet | Jouffret, Virginie Miotello, Guylaine Culotta, Karen Ayrault, Sophie Pible, Olivier Armengaud, Jean |
author_sort | Jouffret, Virginie |
collection | PubMed |
description | BACKGROUND: Soil and sediment microorganisms are highly phylogenetically diverse but are currently largely under-represented in public molecular databases. Their functional characterization by means of metaproteomics is usually performed using metagenomic sequences acquired for the same sample. However, such hugely diverse metagenomic datasets are difficult to assemble; in parallel, theoretical proteomes from isolates available in generic databases are of high quality. Both these factors advocate for the use of theoretical proteomes in metaproteomics interpretation pipelines. Here, we examined a number of database construction strategies with a view to increasing the outputs of metaproteomics studies performed on soil samples. RESULTS: The number of peptide-spectrum matches was found to be of comparable magnitude when using public or sample-specific metagenomics-derived databases. However, numbers were significantly increased when a combination of both types of information was used in a two-step cascaded search. Our data also indicate that the functional annotation of the metaproteomics dataset can be maximized by using a combination of both types of databases. CONCLUSIONS: A two-step strategy combining sample-specific metagenome database and public databases such as the non-redundant NCBI database and a massive soil gene catalog allows maximizing the metaproteomic interpretation both in terms of ratio of assigned spectra and retrieval of function-derived information. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40168-021-01139-1. |
format | Online Article Text |
id | pubmed-8482631 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-84826312021-10-04 Increasing the power of interpretation for soil metaproteomics data Jouffret, Virginie Miotello, Guylaine Culotta, Karen Ayrault, Sophie Pible, Olivier Armengaud, Jean Microbiome Methodology BACKGROUND: Soil and sediment microorganisms are highly phylogenetically diverse but are currently largely under-represented in public molecular databases. Their functional characterization by means of metaproteomics is usually performed using metagenomic sequences acquired for the same sample. However, such hugely diverse metagenomic datasets are difficult to assemble; in parallel, theoretical proteomes from isolates available in generic databases are of high quality. Both these factors advocate for the use of theoretical proteomes in metaproteomics interpretation pipelines. Here, we examined a number of database construction strategies with a view to increasing the outputs of metaproteomics studies performed on soil samples. RESULTS: The number of peptide-spectrum matches was found to be of comparable magnitude when using public or sample-specific metagenomics-derived databases. However, numbers were significantly increased when a combination of both types of information was used in a two-step cascaded search. Our data also indicate that the functional annotation of the metaproteomics dataset can be maximized by using a combination of both types of databases. CONCLUSIONS: A two-step strategy combining sample-specific metagenome database and public databases such as the non-redundant NCBI database and a massive soil gene catalog allows maximizing the metaproteomic interpretation both in terms of ratio of assigned spectra and retrieval of function-derived information. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40168-021-01139-1. 
BioMed Central 2021-09-29 /pmc/articles/PMC8482631/ /pubmed/34587999 http://dx.doi.org/10.1186/s40168-021-01139-1 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Jouffret, Virginie Miotello, Guylaine Culotta, Karen Ayrault, Sophie Pible, Olivier Armengaud, Jean Increasing the power of interpretation for soil metaproteomics data |
title | Increasing the power of interpretation for soil metaproteomics data |
title_full | Increasing the power of interpretation for soil metaproteomics data |
title_fullStr | Increasing the power of interpretation for soil metaproteomics data |
title_full_unstemmed | Increasing the power of interpretation for soil metaproteomics data |
title_short | Increasing the power of interpretation for soil metaproteomics data |
title_sort | increasing the power of interpretation for soil metaproteomics data |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482631/ https://www.ncbi.nlm.nih.gov/pubmed/34587999 http://dx.doi.org/10.1186/s40168-021-01139-1 |
work_keys_str_mv | AT jouffretvirginie increasingthepowerofinterpretationforsoilmetaproteomicsdata AT miotelloguylaine increasingthepowerofinterpretationforsoilmetaproteomicsdata AT culottakaren increasingthepowerofinterpretationforsoilmetaproteomicsdata AT ayraultsophie increasingthepowerofinterpretationforsoilmetaproteomicsdata AT pibleolivier increasingthepowerofinterpretationforsoilmetaproteomicsdata AT armengaudjean increasingthepowerofinterpretationforsoilmetaproteomicsdata |