
Robust deep learning based protein sequence design using ProteinMPNN

While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning based protein sequence design method, ProteinMPNN, with outstan...


Bibliographic Details
Main authors: Dauparas, J., Anishchenko, I., Bennett, N., Bai, H., Ragotte, R. J., Milles, L. F., Wicky, B. I. M., Courbet, A., de Haas, R. J., Bethel, N., Leung, P. J. Y., Huddy, T. F., Pellock, S., Tischer, D., Chan, F., Koepnick, B., Nguyen, H., Kang, A., Sankaran, B., Bera, A. K., King, N. P., Baker, D.
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9997061/
https://www.ncbi.nlm.nih.gov/pubmed/36108050
http://dx.doi.org/10.1126/science.add2187
author Dauparas, J.
Anishchenko, I.
Bennett, N.
Bai, H.
Ragotte, R. J.
Milles, L. F.
Wicky, B. I. M.
Courbet, A.
de Haas, R. J.
Bethel, N.
Leung, P. J. Y.
Huddy, T. F.
Pellock, S.
Tischer, D.
Chan, F.
Koepnick, B.
Nguyen, H.
Kang, A.
Sankaran, B.
Bera, A. K.
King, N. P.
Baker, D.
collection PubMed
description While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning based protein sequence design method, ProteinMPNN, with outstanding performance in both in silico and experimental tests. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4%, compared to 32.9% for Rosetta. Incorporation of noise during training improves sequence recovery on protein structure models, and produces sequences which more robustly encode their structures as assessed using structure prediction algorithms. We demonstrate the broad utility and high accuracy of ProteinMPNN using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins.
format Online
Article
Text
id pubmed-9997061
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9997061 2023-03-09 Science Article 2022-10-07 2022-09-15 /pmc/articles/PMC9997061/ /pubmed/36108050 http://dx.doi.org/10.1126/science.add2187 Text en https://creativecommons.org/licenses/by/4.0/
This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Robust deep learning based protein sequence design using ProteinMPNN
topic Article