Identification and characterization of proteins of unknown function (PUFs) in Clostridium thermocellum DSM 1313 strains as potential genetic engineering targets


Bibliographic Details
Main Authors: Poudel, Suresh; Cope, Alexander L.; O’Dell, Kaela B.; Guss, Adam M.; Seo, Hyeongmin; Trinh, Cong T.; Hettich, Robert L.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112048/
https://www.ncbi.nlm.nih.gov/pubmed/33971924
http://dx.doi.org/10.1186/s13068-021-01964-4
author Poudel, Suresh
Cope, Alexander L.
O’Dell, Kaela B.
Guss, Adam M.
Seo, Hyeongmin
Trinh, Cong T.
Hettich, Robert L.
collection PubMed
description BACKGROUND: Mass spectrometry-based proteomics can identify and quantify thousands of proteins from individual microbial species, but a significant percentage of these proteins are unannotated and hence classified as proteins of unknown function (PUFs). Due to the difficulty in extracting meaningful metabolic information, PUFs are often overlooked or discarded during data analysis, even though they might be critically important in functional activities, in particular for metabolic engineering research. RESULTS: We optimized and employed a pipeline integrating various “guilt-by-association” (GBA) metrics, including differential expression and co-expression analyses of high-throughput mass spectrometry proteome data and phylogenetic coevolution analysis, and sequence homology-based approaches to determine putative functions for PUFs in Clostridium thermocellum. Our various analyses provided putative functional information for over 95% of the PUFs detected by mass spectrometry in a wild-type and/or an engineered strain of C. thermocellum. In particular, we validated a predicted acyltransferase PUF (WP_003519433.1) with functional activity towards 2-phenylethyl alcohol, consistent with our GBA and sequence homology-based predictions. CONCLUSIONS: This work demonstrates the value of leveraging sequence homology-based annotations with empirical evidence based on the concept of GBA to broadly predict putative functions for PUFs, opening avenues to further interrogation via targeted experiments. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13068-021-01964-4.
format Online
Article
Text
id pubmed-8112048
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8112048 2021-05-12 Biotechnol Biofuels Research BioMed Central 2021-05-10 /pmc/articles/PMC8112048/ /pubmed/33971924 http://dx.doi.org/10.1186/s13068-021-01964-4 Text en
© The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identification and characterization of proteins of unknown function (PUFs) in Clostridium thermocellum DSM 1313 strains as potential genetic engineering targets
topic Research