Highly accurate protein structure prediction for the human proteome

Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experi...

Bibliographic Details
Main authors: Tunyasuvunakool, Kathryn, Adler, Jonas, Wu, Zachary, Green, Tim, Zielinski, Michal, Žídek, Augustin, Bridgland, Alex, Cowie, Andrew, Meyer, Clemens, Laydon, Agata, Velankar, Sameer, Kleywegt, Gerard J., Bateman, Alex, Evans, Richard, Pritzel, Alexander, Figurnov, Michael, Ronneberger, Olaf, Bates, Russ, Kohl, Simon A. A., Potapenko, Anna, Ballard, Andrew J., Romera-Paredes, Bernardino, Nikolov, Stanislav, Jain, Rishub, Clancy, Ellen, Reiman, David, Petersen, Stig, Senior, Andrew W., Kavukcuoglu, Koray, Birney, Ewan, Kohli, Pushmeet, Jumper, John, Hassabis, Demis
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8387240/
https://www.ncbi.nlm.nih.gov/pubmed/34293799
http://dx.doi.org/10.1038/s41586-021-03828-1
_version_ 1783742421446688768
author Tunyasuvunakool, Kathryn
Adler, Jonas
Wu, Zachary
Green, Tim
Zielinski, Michal
Žídek, Augustin
Bridgland, Alex
Cowie, Andrew
Meyer, Clemens
Laydon, Agata
Velankar, Sameer
Kleywegt, Gerard J.
Bateman, Alex
Evans, Richard
Pritzel, Alexander
Figurnov, Michael
Ronneberger, Olaf
Bates, Russ
Kohl, Simon A. A.
Potapenko, Anna
Ballard, Andrew J.
Romera-Paredes, Bernardino
Nikolov, Stanislav
Jain, Rishub
Clancy, Ellen
Reiman, David
Petersen, Stig
Senior, Andrew W.
Kavukcuoglu, Koray
Birney, Ewan
Kohli, Pushmeet
Jumper, John
Hassabis, Demis
author_sort Tunyasuvunakool, Kathryn
collection PubMed
description Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure(1). Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold(2), at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.
format Online
Article
Text
id pubmed-8387240
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8387240 2021-09-14 Highly accurate protein structure prediction for the human proteome Nature Article
Nature Publishing Group UK 2021-07-22 2021 /pmc/articles/PMC8387240/ /pubmed/34293799 http://dx.doi.org/10.1038/s41586-021-03828-1 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.
title Highly accurate protein structure prediction for the human proteome
topic Article