AI-accelerated protein-ligand docking for SARS-CoV-2 is 100-fold faster with no significant change in detection
Protein-ligand docking is a computational method for identifying drug leads. The method is capable of narrowing a vast library of compounds down to a tractable size for downstream simulation or experimental testing and is widely used in drug discovery. While there has been progress in accelerating s...
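Main Authors: | Clyde, Austin; Liu, Xuefeng; Brettin, Thomas; Yoo, Hyunseung; Partin, Alexander; Babuji, Yadu; Blaiszik, Ben; Mohd-Yusof, Jamaludin; Merzky, Andre; Turilli, Matteo; Jha, Shantenu; Ramanathan, Arvind; Stevens, Rick |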
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9901402/ https://www.ncbi.nlm.nih.gov/pubmed/36747041 http://dx.doi.org/10.1038/s41598-023-28785-9 |
_version_ | 1784883023590719488 |
---|---|
author | Clyde, Austin Liu, Xuefeng Brettin, Thomas Yoo, Hyunseung Partin, Alexander Babuji, Yadu Blaiszik, Ben Mohd-Yusof, Jamaludin Merzky, Andre Turilli, Matteo Jha, Shantenu Ramanathan, Arvind Stevens, Rick |
author_sort | Clyde, Austin |
collection | PubMed |
description | Protein-ligand docking is a computational method for identifying drug leads. The method is capable of narrowing a vast library of compounds down to a tractable size for downstream simulation or experimental testing and is widely used in drug discovery. While there has been progress in accelerating scoring of compounds with artificial intelligence, few works have bridged these successes back to the virtual screening community in terms of utility and forward-looking development. We demonstrate the power of high-speed ML models by scoring 1 billion molecules in under a day (50 k predictions per GPU second). We showcase a workflow for docking utilizing surrogate AI-based models as a pre-filter to a standard docking workflow. Our workflow is ten times faster at screening a library of compounds than the standard technique, with an error rate less than 0.01% of detecting the underlying best scoring 0.1% of compounds. Our analysis of the speedup explains that another order of magnitude speedup must come from model accuracy rather than computing speed. In order to drive another order of magnitude of acceleration, we share a benchmark dataset consisting of 200 million 3D complex structures and 2D structure scores across a consistent set of 13 million “in-stock” molecules over 15 receptors, or binding sites, across the SARS-CoV-2 proteome. We believe this is strong evidence for the community to begin focusing on improving the accuracy of surrogate models to improve the ability to screen massive compound libraries 100× or even 1000× faster than current techniques and reduce missing top hits. The technique outlined aims to be a fast drop-in replacement for docking for screening billion-scale molecular libraries. |
format | Online Article Text |
id | pubmed-9901402 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9901402 2023-02-07 Sci Rep Article Nature Publishing Group UK 2023-02-06 /pmc/articles/PMC9901402/ /pubmed/36747041 http://dx.doi.org/10.1038/s41598-023-28785-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | AI-accelerated protein-ligand docking for SARS-CoV-2 is 100-fold faster with no significant change in detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9901402/ https://www.ncbi.nlm.nih.gov/pubmed/36747041 http://dx.doi.org/10.1038/s41598-023-28785-9 |
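
The description in this record outlines a workflow in which a fast surrogate model pre-filters a compound library before standard docking, and quotes a throughput of 50 k predictions per GPU second. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the synthetic scores, the noise level, and the convention that lower scores are better are all assumptions made for illustration.

```python
import random


def gpu_hours(n_molecules: int, preds_per_gpu_second: float) -> float:
    """GPU-hours needed to score n_molecules at the given surrogate throughput."""
    return n_molecules / preds_per_gpu_second / 3600.0


def prefilter_then_dock(library, surrogate_score, dock_score, keep_fraction):
    """Rank every molecule with the cheap surrogate, then dock only the best-ranked fraction.

    Lower scores are treated as better (an assumption of this sketch).
    Returns {molecule: docking score} for the docked shortlist.
    """
    n_keep = max(1, int(len(library) * keep_fraction))
    shortlist = sorted(library, key=surrogate_score)[:n_keep]
    return {mol: dock_score(mol) for mol in shortlist}


def recall_of_top(library, dock_score, docked, top_fraction):
    """Fraction of the true best-scoring molecules that survived the pre-filter."""
    n_top = max(1, int(len(library) * top_fraction))
    true_top = sorted(library, key=dock_score)[:n_top]
    return sum(mol in docked for mol in true_top) / n_top


# Throughput quoted in the description: 1 billion molecules at 50 k predictions
# per GPU second is 20,000 GPU seconds, i.e. about 5.6 GPU-hours.
hours = gpu_hours(1_000_000_000, 50_000)

# Toy library with synthetic scores (hypothetical, not data from the paper).
rng = random.Random(0)
library = list(range(10_000))
true_scores = {mol: rng.gauss(0.0, 1.0) for mol in library}
noisy_scores = {mol: s + rng.gauss(0.0, 0.3) for mol, s in true_scores.items()}

# Dock only the 10% of molecules the surrogate ranks best, then check how many
# of the true top 0.1% were kept.
docked = prefilter_then_dock(library, noisy_scores.get, true_scores.get, keep_fraction=0.1)
recall = recall_of_top(library, true_scores.get, docked, top_fraction=0.001)
print(f"{hours:.1f} GPU-hours; docked {len(docked)} of {len(library)}; recall of top 0.1%: {recall:.2f}")
```

In this sketch the docking cost scales with `keep_fraction`, while the chance of missing a top hit depends on how well the surrogate ranking agrees with the docking ranking, which is in line with the description's point that further speedup has to come from model accuracy rather than computing speed.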