
MetaLP: An integrative linear programming method for protein inference in metaproteomics

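Metaproteomics based on high-throughput tandem mass spectrometry (MS/MS) plays a crucial role in characterizing microbiome functions. The acquired MS/MS data is searched against a protein sequence database to identify peptides, which are then used to infer a list of proteins present in a metaproteome sample. While the problem of protein inference has been well studied for proteomics of single organisms, it remains a major challenge for metaproteomics of complex microbial communities because of the large number of degenerate peptides shared among homologous proteins in different organisms. This challenge calls for improved discrimination of true protein identifications from false protein identifications given a set of unique and degenerate peptides identified in metaproteomics. MetaLP was developed here for protein inference in metaproteomics using an integrative linear programming method. Taxonomic abundance information extracted from metagenomics shotgun sequencing or 16S rRNA gene amplicon sequencing was incorporated as prior information in MetaLP. Benchmarking with mock, human gut, soil, and marine microbial communities demonstrated significantly higher numbers of protein identifications by MetaLP than ProteinLP, PeptideProphet, DeepPep, PIPQ, and Sipros Ensemble. In conclusion, MetaLP could substantially improve protein inference for complex metaproteomes by incorporating taxonomic abundance information in a linear programming model.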


Bibliographic Details
Main authors: Feng, Shichao, Ji, Hong-Long, Wang, Huan, Zhang, Bailu, Sterzenbach, Ryan, Pan, Chongle, Guo, Xuan
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629623/
https://www.ncbi.nlm.nih.gov/pubmed/36269761
http://dx.doi.org/10.1371/journal.pcbi.1010603
author Feng, Shichao
Ji, Hong-Long
Wang, Huan
Zhang, Bailu
Sterzenbach, Ryan
Pan, Chongle
Guo, Xuan
collection PubMed
description Metaproteomics based on high-throughput tandem mass spectrometry (MS/MS) plays a crucial role in characterizing microbiome functions. The acquired MS/MS data is searched against a protein sequence database to identify peptides, which are then used to infer a list of proteins present in a metaproteome sample. While the problem of protein inference has been well studied for proteomics of single organisms, it remains a major challenge for metaproteomics of complex microbial communities because of the large number of degenerate peptides shared among homologous proteins in different organisms. This challenge calls for improved discrimination of true protein identifications from false protein identifications given a set of unique and degenerate peptides identified in metaproteomics. MetaLP was developed here for protein inference in metaproteomics using an integrative linear programming method. Taxonomic abundance information extracted from metagenomics shotgun sequencing or 16S rRNA gene amplicon sequencing was incorporated as prior information in MetaLP. Benchmarking with mock, human gut, soil, and marine microbial communities demonstrated significantly higher numbers of protein identifications by MetaLP than ProteinLP, PeptideProphet, DeepPep, PIPQ, and Sipros Ensemble. In conclusion, MetaLP could substantially improve protein inference for complex metaproteomes by incorporating taxonomic abundance information in a linear programming model.
format Online
Article
Text
id pubmed-9629623
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9629623 2022-11-03 MetaLP: An integrative linear programming method for protein inference in metaproteomics. Feng, Shichao; Ji, Hong-Long; Wang, Huan; Zhang, Bailu; Sterzenbach, Ryan; Pan, Chongle; Guo, Xuan. PLoS Comput Biol. Research Article.
Public Library of Science 2022-10-21 /pmc/articles/PMC9629623/ /pubmed/36269761 http://dx.doi.org/10.1371/journal.pcbi.1010603 Text en © 2022 Feng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title MetaLP: An integrative linear programming method for protein inference in metaproteomics
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629623/
https://www.ncbi.nlm.nih.gov/pubmed/36269761
http://dx.doi.org/10.1371/journal.pcbi.1010603