Machine learning prediction of 3CL(pro) SARS-CoV-2 docking scores


Bibliographic Details
Main Authors: Bucinsky, Lukas; Bortňák, Dušan; Gall, Marián; Matúška, Ján; Milata, Viktor; Pitoňák, Michal; Štekláč, Marek; Végh, Daniel; Zajaček, Dávid
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8881816/
https://www.ncbi.nlm.nih.gov/pubmed/35288359
http://dx.doi.org/10.1016/j.compbiolchem.2022.107656
author Bucinsky, Lukas
Bortňák, Dušan
Gall, Marián
Matúška, Ján
Milata, Viktor
Pitoňák, Michal
Štekláč, Marek
Végh, Daniel
Zajaček, Dávid
collection PubMed
description Molecular docking results of two training sets containing 866 and 8,696 compounds were used to train three different machine learning (ML) approaches. Neural network approaches according to Keras and TensorFlow libraries and the gradient boosted decision trees approach of XGBoost were used with DScribe’s Smooth Overlap of Atomic Positions molecular descriptors. In addition, neural networks using the SchNetPack library and descriptors were used. The ML performance was tested on three different sets, including compounds for future organic synthesis. The final evaluation of the ML predicted docking scores was based on the ZINC in vivo set, from which 1,200 compounds were randomly selected with respect to their size. The results obtained showed a consistent ML prediction capability of docking scores, and even though compounds with more than 60 atoms were found slightly overestimated they remain valid for a subsequent evaluation of their drug repurposing suitability.
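The description says that 1,200 compounds were randomly selected from the ZINC in vivo set "with respect to their size" but does not spell out the procedure. The sketch below shows one plausible reading, assuming size means atom count and that the draw is spread evenly across atom-count bins. The function name `sample_by_size`, the `(compound_id, n_atoms)` input format, the bin width, and the round-robin scheme are all hypothetical and are not taken from the article.

```python
import random
from collections import defaultdict


def sample_by_size(compounds, n_total, bin_width=10, seed=0):
    """Draw up to n_total compound ids, spread evenly across atom-count bins.

    compounds: list of (compound_id, n_atoms) pairs (hypothetical format).
    bin_width: width of each atom-count bin (an assumption, not from the paper).
    """
    rng = random.Random(seed)

    # Group compound ids by size bin, e.g. 60-69 atoms -> bin 6 for bin_width=10.
    bins = defaultdict(list)
    for compound_id, n_atoms in compounds:
        bins[n_atoms // bin_width].append(compound_id)

    # Shuffle inside each bin so the pick within a size class is random.
    for members in bins.values():
        rng.shuffle(members)

    # Take one compound per bin in turn until the quota is met
    # or every bin has been exhausted.
    selected = []
    keys = sorted(bins)
    while len(selected) < n_total and any(bins[k] for k in keys):
        for k in keys:
            if bins[k] and len(selected) < n_total:
                selected.append(bins[k].pop())
    return selected
```

Calling `sample_by_size(compounds, 1200)` on a list of `(id, n_atoms)` pairs would return at most 1,200 ids. A fixed seed keeps the draw reproducible, and the round-robin pass prevents the most populated size range from dominating the test set.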
format Online
Article
Text
id pubmed-8881816
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8881816 2022-02-28 Comput Biol Chem Article Elsevier Ltd. 2022-06 2022-02-26 /pmc/articles/PMC8881816/ /pubmed/35288359 http://dx.doi.org/10.1016/j.compbiolchem.2022.107656 Text en
© 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website.
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Machine learning prediction of 3CL(pro) SARS-CoV-2 docking scores
topic Article