Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models i...

Bibliographic Details
Main authors: Polykovskiy, Daniil; Zhebrak, Alexander; Sanchez-Lengeling, Benjamin; Golovanov, Sergey; Tatanov, Oktai; Belyaev, Stanislav; Kurbanov, Rauf; Artamonov, Aleksey; Aladinskiy, Vladimir; Veselov, Mark; Kadurin, Artur; Johansson, Simon; Chen, Hongming; Nikolenko, Sergey; Aspuru-Guzik, Alán; Zhavoronkov, Alex
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Pharmacology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7775580/
https://www.ncbi.nlm.nih.gov/pubmed/33390943
http://dx.doi.org/10.3389/fphar.2020.565644
collection PubMed
description Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in downstream tasks. While there are plenty of generative models, it is unclear how to compare and rank them. In this work, we introduce a benchmarking platform called Molecular Sets (MOSES) to standardize training and comparison of molecular generative models. MOSES provides training and testing datasets, and a set of metrics to evaluate the quality and diversity of generated structures. We have implemented and compared several molecular generation models and suggest using our results as reference points for further advancements in generative chemistry research. The platform and source code are available at https://github.com/molecularsets/moses.
id pubmed-7775580
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7775580 2021-01-02 Front Pharmacol Pharmacology Frontiers Media S.A. 2020-12-18 /pmc/articles/PMC7775580/ /pubmed/33390943 http://dx.doi.org/10.3389/fphar.2020.565644 Text en
Copyright © 2020 Polykovskiy, Zhebrak, Sanchez-Lengeling, Golovanov, Tatanov, Belyaev, Kurbanov, Artamonov, Aladinskiy, Veselov, Kadurin, Johansson, Chen, Nikolenko, Aspuru-Guzik and Zhavoronkov. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Pharmacology
work_keys_str_mv AT polykovskiydaniil molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT zhebrakalexander molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT sanchezlengelingbenjamin molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT golovanovsergey molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT tatanovoktai molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT belyaevstanislav molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT kurbanovrauf molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT artamonovaleksey molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT aladinskiyvladimir molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT veselovmark molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT kadurinartur molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT johanssonsimon molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT chenhongming molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT nikolenkosergey molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT aspuruguzikalan molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels
AT zhavoronkovalex molecularsetsmosesabenchmarkingplatformformoleculargenerationmodels