
Considerations for constructing a protein sequence database for metaproteomics

Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to...


Bibliographic Details
Main authors: Blakeley-Ruiz, J. Alfredo; Kleiner, Manuel
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8861567/
https://www.ncbi.nlm.nih.gov/pubmed/35242286
http://dx.doi.org/10.1016/j.csbj.2022.01.018
author Blakeley-Ruiz, J. Alfredo
Kleiner, Manuel
collection PubMed
description Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies.
format Online
Article
Text
id pubmed-8861567
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-8861567 2022-03-02 Considerations for constructing a protein sequence database for metaproteomics Blakeley-Ruiz, J. Alfredo Kleiner, Manuel Comput Struct Biotechnol J Review Article Research Network of Computational and Structural Biotechnology 2022-01-21 /pmc/articles/PMC8861567/ /pubmed/35242286 http://dx.doi.org/10.1016/j.csbj.2022.01.018 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Considerations for constructing a protein sequence database for metaproteomics
topic Review Article