De novo protein fold families expand the designable ligand binding site space

A major challenge in designing proteins de novo to bind user-defined ligands with high affinity is finding backbone structures into which a new binding site geometry can be engineered with high precision. Recent advances in methods to generate protein fold families de novo have expanded the space o...

Bibliographic Details
Main authors: Pan, Xingjie; Kortemme, Tanja
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8648124/
https://www.ncbi.nlm.nih.gov/pubmed/34807909
http://dx.doi.org/10.1371/journal.pcbi.1009620
collection PubMed
description A major challenge in designing proteins de novo to bind user-defined ligands with high affinity is finding backbone structures into which a new binding site geometry can be engineered with high precision. Recent advances in methods to generate protein fold families de novo have expanded the space of accessible protein structures, but it is not clear to what extent de novo proteins with diverse geometries also expand the space of designable ligand binding functions. We constructed a library of 25,806 high-quality ligand binding sites and developed a fast protocol to place (“match”) these binding sites into both naturally occurring and de novo protein families with two fold topologies: Rossmann and NTF2. Each matching step involves engineering new binding site residues into each protein “scaffold”, which is distinct from the problem of comparing already existing binding pockets. In total, 5,896 and 7,475 binding sites could be matched to the Rossmann and NTF2 fold families, respectively. De novo designed Rossmann and NTF2 protein families can support 1,791 and 678 binding sites, respectively, that cannot be matched to naturally existing structures with the same topologies. While the number of protein residues in ligand binding sites is the major determinant of matching success, ligand size and primary sequence separation of binding site residues also play important roles. The number of matched binding sites is a power-law function of the number of members in a fold family. Our results suggest that de novo sampling of geometric variations on diverse fold topologies can significantly expand the space of designable ligand binding sites for a wealth of possible new protein functions.
format Online
Article
Text
id pubmed-8648124
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8648124 2021-12-07 De novo protein fold families expand the designable ligand binding site space Pan, Xingjie Kortemme, Tanja PLoS Comput Biol Research Article
Public Library of Science 2021-11-22 /pmc/articles/PMC8648124/ /pubmed/34807909 http://dx.doi.org/10.1371/journal.pcbi.1009620 Text en © 2021 Pan, Kortemme https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title De novo protein fold families expand the designable ligand binding site space
topic Research Article