Highly accurate protein structure prediction for the human proteome
Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experi...
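Main Authors: | Tunyasuvunakool, Kathryn; Adler, Jonas; Wu, Zachary; Green, Tim; Zielinski, Michal; Žídek, Augustin; Bridgland, Alex; Cowie, Andrew; Meyer, Clemens; Laydon, Agata; Velankar, Sameer; Kleywegt, Gerard J.; Bateman, Alex; Evans, Richard; Pritzel, Alexander; Figurnov, Michael; Ronneberger, Olaf; Bates, Russ; Kohl, Simon A. A.; Potapenko, Anna; Ballard, Andrew J.; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Clancy, Ellen; Reiman, David; Petersen, Stig; Senior, Andrew W.; Kavukcuoglu, Koray; Birney, Ewan; Kohli, Pushmeet; Jumper, John; Hassabis, Demis |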
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8387240/ https://www.ncbi.nlm.nih.gov/pubmed/34293799 http://dx.doi.org/10.1038/s41586-021-03828-1 |
_version_ | 1783742421446688768 |
---|---|
author | Tunyasuvunakool, Kathryn Adler, Jonas Wu, Zachary Green, Tim Zielinski, Michal Žídek, Augustin Bridgland, Alex Cowie, Andrew Meyer, Clemens Laydon, Agata Velankar, Sameer Kleywegt, Gerard J. Bateman, Alex Evans, Richard Pritzel, Alexander Figurnov, Michael Ronneberger, Olaf Bates, Russ Kohl, Simon A. A. Potapenko, Anna Ballard, Andrew J. Romera-Paredes, Bernardino Nikolov, Stanislav Jain, Rishub Clancy, Ellen Reiman, David Petersen, Stig Senior, Andrew W. Kavukcuoglu, Koray Birney, Ewan Kohli, Pushmeet Jumper, John Hassabis, Demis |
author_facet | Tunyasuvunakool, Kathryn Adler, Jonas Wu, Zachary Green, Tim Zielinski, Michal Žídek, Augustin Bridgland, Alex Cowie, Andrew Meyer, Clemens Laydon, Agata Velankar, Sameer Kleywegt, Gerard J. Bateman, Alex Evans, Richard Pritzel, Alexander Figurnov, Michael Ronneberger, Olaf Bates, Russ Kohl, Simon A. A. Potapenko, Anna Ballard, Andrew J. Romera-Paredes, Bernardino Nikolov, Stanislav Jain, Rishub Clancy, Ellen Reiman, David Petersen, Stig Senior, Andrew W. Kavukcuoglu, Koray Birney, Ewan Kohli, Pushmeet Jumper, John Hassabis, Demis |
author_sort | Tunyasuvunakool, Kathryn |
collection | PubMed |
description | Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure(1). Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold(2), at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective. |
format | Online Article Text |
id | pubmed-8387240 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-83872402021-09-14 Highly accurate protein structure prediction for the human proteome Tunyasuvunakool, Kathryn Adler, Jonas Wu, Zachary Green, Tim Zielinski, Michal Žídek, Augustin Bridgland, Alex Cowie, Andrew Meyer, Clemens Laydon, Agata Velankar, Sameer Kleywegt, Gerard J. Bateman, Alex Evans, Richard Pritzel, Alexander Figurnov, Michael Ronneberger, Olaf Bates, Russ Kohl, Simon A. A. Potapenko, Anna Ballard, Andrew J. Romera-Paredes, Bernardino Nikolov, Stanislav Jain, Rishub Clancy, Ellen Reiman, David Petersen, Stig Senior, Andrew W. Kavukcuoglu, Koray Birney, Ewan Kohli, Pushmeet Jumper, John Hassabis, Demis Nature Article Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure(1). Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold(2), at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective. 
Nature Publishing Group UK 2021-07-22 2021 /pmc/articles/PMC8387240/ /pubmed/34293799 http://dx.doi.org/10.1038/s41586-021-03828-1 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Tunyasuvunakool, Kathryn Adler, Jonas Wu, Zachary Green, Tim Zielinski, Michal Žídek, Augustin Bridgland, Alex Cowie, Andrew Meyer, Clemens Laydon, Agata Velankar, Sameer Kleywegt, Gerard J. Bateman, Alex Evans, Richard Pritzel, Alexander Figurnov, Michael Ronneberger, Olaf Bates, Russ Kohl, Simon A. A. Potapenko, Anna Ballard, Andrew J. Romera-Paredes, Bernardino Nikolov, Stanislav Jain, Rishub Clancy, Ellen Reiman, David Petersen, Stig Senior, Andrew W. Kavukcuoglu, Koray Birney, Ewan Kohli, Pushmeet Jumper, John Hassabis, Demis Highly accurate protein structure prediction for the human proteome |
title | Highly accurate protein structure prediction for the human proteome |
title_full | Highly accurate protein structure prediction for the human proteome |
title_fullStr | Highly accurate protein structure prediction for the human proteome |
title_full_unstemmed | Highly accurate protein structure prediction for the human proteome |
title_short | Highly accurate protein structure prediction for the human proteome |
title_sort | highly accurate protein structure prediction for the human proteome |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8387240/ https://www.ncbi.nlm.nih.gov/pubmed/34293799 http://dx.doi.org/10.1038/s41586-021-03828-1 |