
Discovery of novel carbohydrate-active enzymes through the rational exploration of the protein sequences space



Bibliographic Details
Main authors: Helbert, William, Poulet, Laurent, Drouillard, Sophie, Mathieu, Sophie, Loiodice, Mélanie, Couturier, Marie, Lombard, Vincent, Terrapon, Nicolas, Turchetto, Jeremy, Vincentelli, Renaud, Henrissat, Bernard
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2019
Subjects: Biological Sciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6442616/
https://www.ncbi.nlm.nih.gov/pubmed/30850540
http://dx.doi.org/10.1073/pnas.1815791116
_version_ 1783407736215568384
author Helbert, William
Poulet, Laurent
Drouillard, Sophie
Mathieu, Sophie
Loiodice, Mélanie
Couturier, Marie
Lombard, Vincent
Terrapon, Nicolas
Turchetto, Jeremy
Vincentelli, Renaud
Henrissat, Bernard
author_sort Helbert, William
collection PubMed
description Over the last two decades, the number of gene/protein sequences gleaned from sequencing projects of individual genomes and environmental DNA has grown exponentially. Only a tiny fraction of these predicted proteins has been experimentally characterized, and the function of most proteins remains hypothetical or only predicted based on sequence similarity. Despite the development of postgenomic methods, such as transcriptomics, proteomics, and metabolomics, the assignment of function to protein sequences remains one of the main challenges in modern biology. As in all classes of proteins, the growing number of predicted carbohydrate-active enzymes (CAZymes) has not been accompanied by a systematic and accurate attribution of function. Taking advantage of the CAZy database, which groups CAZymes into families and subfamilies based on amino acid similarities, we recombinantly produced 564 proteins selected from subfamilies without any biochemically characterized representatives, from distant relatives of characterized enzymes and from nonclassified proteins that show little similarity with known CAZymes. Screening these proteins for activity on a wide collection of carbohydrate substrates led to the discovery of 13 CAZyme families (two of which were also discovered by others during the course of our work), revealed three previously unknown substrate specificities, and assigned a function to 25 subfamilies.
format Online
Article
Text
id pubmed-6442616
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-6442616 2019-04-05 Proc Natl Acad Sci U S A Biological Sciences National Academy of Sciences 2019-03-26 2019-03-08 /pmc/articles/PMC6442616/ /pubmed/30850540 http://dx.doi.org/10.1073/pnas.1815791116 Text en Copyright © 2019 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Discovery of novel carbohydrate-active enzymes through the rational exploration of the protein sequences space
topic Biological Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6442616/
https://www.ncbi.nlm.nih.gov/pubmed/30850540
http://dx.doi.org/10.1073/pnas.1815791116