Considerations for constructing a protein sequence database for metaproteomics
Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to...
Main authors: | Blakeley-Ruiz, J. Alfredo; Kleiner, Manuel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | Review Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8861567/ https://www.ncbi.nlm.nih.gov/pubmed/35242286 http://dx.doi.org/10.1016/j.csbj.2022.01.018 |
_version_ | 1784654911923814400 |
---|---|
author | Blakeley-Ruiz, J. Alfredo; Kleiner, Manuel |
collection | PubMed |
description | Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies. |
format | Online Article Text |
id | pubmed-8861567 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-8861567 2022-03-02 Considerations for constructing a protein sequence database for metaproteomics Blakeley-Ruiz, J. Alfredo; Kleiner, Manuel Comput Struct Biotechnol J Review Article Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies. Research Network of Computational and Structural Biotechnology 2022-01-21 /pmc/articles/PMC8861567/ /pubmed/35242286 http://dx.doi.org/10.1016/j.csbj.2022.01.018 Text en © 2022 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
title | Considerations for constructing a protein sequence database for metaproteomics |
topic | Review Article |