Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery

Computational methods with affordable computational resources are highly desirable for identifying active drug leads from millions of compounds. This requires a model that is both highly efficient and relatively accurate, which cannot be achieved by most of the current methods. In real virtual scree...

Bibliographic Details
Main Authors:	Zhang, Haiping, Lin, Xiao, Wei, Yanjie, Zhang, Huiling, Liao, Linbu, Wu, Hao, Pan, Yi, Wu, Xuli
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Molecular Biosciences
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200220/ https://www.ncbi.nlm.nih.gov/pubmed/35720125 http://dx.doi.org/10.3389/fmolb.2022.872086

author	Zhang, Haiping Lin, Xiao Wei, Yanjie Zhang, Huiling Liao, Linbu Wu, Hao Pan, Yi Wu, Xuli
collection	PubMed
description	Computational methods with affordable computational resources are highly desirable for identifying active drug leads from millions of compounds. This requires a model that is both highly efficient and relatively accurate, which cannot be achieved by most of the current methods. In real virtual screening (VS) application scenarios, the desired method should perform much better in selecting active compounds by prediction than by random chance. Here, we systematically evaluate the performance of our previously developed DFCNN model in large-scale virtual screening, and the results show that, on average, our method has approximately 22 times the success rate of random chance at a score cutoff of 0.99. Of the 102 test cases, 10 cases have more than 98 times the success rate of a random guess. Interestingly, in three cases, the prediction success rate is 99 times that of a random guess at a score cutoff of 0.99. This indicates that in most situations after our extremely large-scale VS, the dataset can be reduced 20- to 100-fold for the next step of virtual screening based on docking or MD simulation. Furthermore, we have employed an experimental method to verify our computational method by finding several active inhibitors for Trypsin I Protease. In addition, we also show its proof-of-concept application in de novo drug screening. The results indicate the massive potential of this method in the first step of the real drug development workflow. Moreover, DFCNN only takes about 0.0000225 s for one protein–compound prediction on average with 80 Intel CPU cores (2.00 GHz) and 60 GB RAM, which is at least tens of thousands of times faster than AutoDock Vina or Schrödinger high-throughput virtual screening. Additionally, an online webserver based on DFCNN for large-scale screening is available at http://cbblab.siat.ac.cn/DFCNN/index.php for the convenience of the users.
format	Online Article Text
id	pubmed-9200220
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9200220 2022-06-16 Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2022-06-01 /pmc/articles/PMC9200220/ /pubmed/35720125 http://dx.doi.org/10.3389/fmolb.2022.872086 Text en Copyright © 2022 Zhang, Lin, Wei, Zhang, Liao, Wu, Pan and Wu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery
topic	Molecular Biosciences
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200220/ https://www.ncbi.nlm.nih.gov/pubmed/35720125 http://dx.doi.org/10.3389/fmolb.2022.872086
