ChEMBL-Likeness Score and Database GDBChEMBL

The generated database GDB17 enumerates 166.4 billion molecules up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we r...

Bibliographic Details
Main Authors: Bühlmann, Sven; Reymond, Jean-Louis
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7010641/
https://www.ncbi.nlm.nih.gov/pubmed/32117874
http://dx.doi.org/10.3389/fchem.2020.00046
_version_ 1783495908299636736
author Bühlmann, Sven
Reymond, Jean-Louis
author_facet Bühlmann, Sven
Reymond, Jean-Louis
author_sort Bühlmann, Sven
collection PubMed
description The generated database GDB17 enumerates 166.4 billion molecules up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we report GDBChEMBL as a subset of GDB17 featuring 10 million molecules selected according to a ChEMBL-likeness score (CLscore) calculated from the frequency of occurrence of circular substructures in ChEMBL, followed by uniform sampling across molecular size, stereocenters and heteroatoms. Compared to the previously reported subsets FDB17 and GDBMedChem, selected from GDB17 by fragment-likeness and medicinal chemistry criteria, respectively, our new subset features molecules with higher synthetic accessibility and possibly bioactivity, yet retains a broad and continuous coverage of chemical space typical of the entire GDB17. GDBChEMBL is accessible at http://gdb.unibe.ch for download and for browsing using an interactive chemical space map at http://faerun.gdb.tools.
format Online
Article
Text
id pubmed-7010641
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7010641 2020-02-28 ChEMBL-Likeness Score and Database GDBChEMBL Bühlmann, Sven Reymond, Jean-Louis Front Chem Chemistry The generated database GDB17 enumerates 166.4 billion molecules up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we report GDBChEMBL as a subset of GDB17 featuring 10 million molecules selected according to a ChEMBL-likeness score (CLscore) calculated from the frequency of occurrence of circular substructures in ChEMBL, followed by uniform sampling across molecular size, stereocenters and heteroatoms. Compared to the previously reported subsets FDB17 and GDBMedChem, selected from GDB17 by fragment-likeness and medicinal chemistry criteria, respectively, our new subset features molecules with higher synthetic accessibility and possibly bioactivity, yet retains a broad and continuous coverage of chemical space typical of the entire GDB17. GDBChEMBL is accessible at http://gdb.unibe.ch for download and for browsing using an interactive chemical space map at http://faerun.gdb.tools. Frontiers Media S.A. 2020-02-04 /pmc/articles/PMC7010641/ /pubmed/32117874 http://dx.doi.org/10.3389/fchem.2020.00046 Text en Copyright © 2020 Bühlmann and Reymond. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Chemistry
Bühlmann, Sven
Reymond, Jean-Louis
ChEMBL-Likeness Score and Database GDBChEMBL
title ChEMBL-Likeness Score and Database GDBChEMBL
title_full ChEMBL-Likeness Score and Database GDBChEMBL
title_fullStr ChEMBL-Likeness Score and Database GDBChEMBL
title_full_unstemmed ChEMBL-Likeness Score and Database GDBChEMBL
title_short ChEMBL-Likeness Score and Database GDBChEMBL
title_sort chembl-likeness score and database gdbchembl
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7010641/
https://www.ncbi.nlm.nih.gov/pubmed/32117874
http://dx.doi.org/10.3389/fchem.2020.00046
work_keys_str_mv AT buhlmannsven chembllikenessscoreanddatabasegdbchembl
AT reymondjeanlouis chembllikenessscoreanddatabasegdbchembl
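
Illustrative note: the description field above summarizes a two-step selection, a likeness score derived from how often a molecule's circular substructures occur in a reference collection, followed by uniform sampling across property bins. The Python sketch below shows only the general shape of such a procedure and is not the authors' implementation. It assumes each molecule has already been reduced to a list of opaque substructure identifiers (in practice these would come from a cheminformatics toolkit), and the scoring formula (mean log-scaled frequency) and all function names are hypothetical choices made for this example.

```python
import math
import random
from collections import Counter, defaultdict


def substructure_frequencies(reference_fragments):
    """Count in how many reference molecules each substructure identifier occurs."""
    counts = Counter()
    for frags in reference_fragments:
        counts.update(set(frags))  # count each identifier once per molecule
    return counts


def likeness_score(frags, counts, n_reference):
    """Mean log-scaled reference frequency of a molecule's substructures.

    Assumed formula for illustration only; returns a value in [0, 1].
    """
    unique = set(frags)
    if not unique or n_reference <= 0:
        return 0.0
    total = sum(math.log1p(counts.get(f, 0)) for f in unique)
    return total / (len(unique) * math.log1p(n_reference))


def uniform_sample(molecules, key, per_bin, seed=0):
    """Draw up to `per_bin` molecules from every bin defined by `key`.

    `key` maps a molecule to a hashable, sortable bin label, for example a
    tuple (size, stereocenters, heteroatoms).
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for mol in molecules:
        bins[key(mol)].append(mol)
    sample = []
    for label in sorted(bins):
        members = bins[label]
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample
```

In use, one would first keep molecules whose score exceeds a chosen threshold and then pass the survivors to `uniform_sample`, so that rare sizes or heteroatom counts are not crowded out by the most populated bins.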