
A general approach for retrosynthetic molecular core analysis

Scaffold analysis of compound data sets has reemerged as a chemically interpretable alternative to machine learning for chemical space and structure–activity relationships analysis. In this context, analog series-based scaffolds (ASBS) are synthetically relevant core structures that represent indivi...


Bibliographic Details
Main authors: Naveja, J. Jesús, Pilón-Jiménez, B. Angélica, Bajorath, Jürgen, Medina-Franco, José L.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6760108/
https://www.ncbi.nlm.nih.gov/pubmed/33430974
http://dx.doi.org/10.1186/s13321-019-0380-5
author Naveja, J. Jesús
Pilón-Jiménez, B. Angélica
Bajorath, Jürgen
Medina-Franco, José L.
collection PubMed
description Scaffold analysis of compound data sets has reemerged as a chemically interpretable alternative to machine learning for the analysis of chemical space and structure–activity relationships. In this context, analog series-based scaffolds (ASBS) are synthetically relevant core structures that represent individual series of analogs. As an extension to ASBS, we herein introduce a general conceptual framework that considers all putative cores of molecules in a compound data set, thus softening the often applied “single molecule–single scaffold” correspondence. A putative core is here defined as any substructure of a molecule complying with two basic rules: (a) the size of the core is a significant proportion of the whole molecule size and (b) the substructure can be reached from the original molecule through a succession of retrosynthesis rules. Thereafter, a bipartite network consisting of molecules and cores can be constructed for a database of chemical structures. Compounds linked to the same cores are considered analogs. We present case studies illustrating the potential of the general framework. The applications include inter- and intra-core diversity analysis of compound data sets, structure–property relationships, and identification of analog series and ASBS. The molecule–core network herein presented is a general methodology with multiple applications in scaffold analysis. New statistical methods are envisioned that will be able to draw quantitative conclusions from these data. The code to use the method presented in this work is freely available as an additional file. Follow-up applications include analog searching and core structure–property relationship analyses.
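The two core rules and the molecule–core network described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the molecule and core names, the heavy-atom counts, and the 2/3 size threshold are all hypothetical, and the candidate cores are assumed to have been produced beforehand by some retrosynthetic fragmentation step (rule b). The sketch only applies the size filter (rule a), builds the bipartite network, and lists compounds that share a core.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy input: molecule -> (heavy-atom count, {candidate core: heavy-atom count}).
# Candidate cores are assumed to come from a prior retrosynthetic fragmentation step.
MOLECULES = {
    "mol_A": (18, {"core_1": 14, "core_2": 6}),
    "mol_B": (20, {"core_1": 14, "core_3": 15}),
    "mol_C": (21, {"core_3": 15}),
}

MIN_FRACTION = 2 / 3  # assumed size threshold, not a value taken from the paper


def build_network(molecules, min_fraction=MIN_FRACTION):
    """Build the bipartite molecule-core network as two adjacency maps."""
    mol_to_cores = defaultdict(set)
    core_to_mols = defaultdict(set)
    for mol, (size, cores) in molecules.items():
        for core, core_size in cores.items():
            # Rule (a): keep a core only if it covers enough of the molecule.
            if core_size / size >= min_fraction:
                mol_to_cores[mol].add(core)
                core_to_mols[core].add(mol)
    return dict(mol_to_cores), dict(core_to_mols)


def analog_pairs(core_to_mols):
    """Return pairs of molecules linked to at least one common core."""
    pairs = set()
    for mols in core_to_mols.values():
        pairs.update(combinations(sorted(mols), 2))
    return pairs


if __name__ == "__main__":
    mol_to_cores, core_to_mols = build_network(MOLECULES)
    for core, mols in sorted(core_to_mols.items()):
        print(core, sorted(mols))
    print(sorted(analog_pairs(core_to_mols)))
```

Because a molecule may keep several cores, mol_B in this toy data is an analog of both mol_A and mol_C even though those two share no core, which is the relaxation of the "single molecule–single scaffold" correspondence that the abstract describes.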
format Online
Article
Text
id pubmed-6760108
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6760108 2019-09-30 A general approach for retrosynthetic molecular core analysis Naveja, J. Jesús Pilón-Jiménez, B. Angélica Bajorath, Jürgen Medina-Franco, José L. J Cheminform Methodology Springer International Publishing 2019-09-24 /pmc/articles/PMC6760108/ /pubmed/33430974 http://dx.doi.org/10.1186/s13321-019-0380-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A general approach for retrosynthetic molecular core analysis
topic Methodology