Analysis of drug–endogenous human metabolite similarities in terms of their maximum common substructures

In previous work, we have assessed the structural similarities between marketed drugs (‘drugs’) and endogenous natural human metabolites (‘metabolites’ or ‘endogenites’), using ‘fingerprint’ methods in common use, and the Tanimoto and Tversky similarity metrics, finding that the fingerprint encoding...

Bibliographic Details
Main authors: O’Hagan, Steve; Kell, Douglas B.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5344883/
https://www.ncbi.nlm.nih.gov/pubmed/28316656
http://dx.doi.org/10.1186/s13321-017-0198-y
_version_ 1782513605039620096
author O’Hagan, Steve
Kell, Douglas B.
author_facet O’Hagan, Steve
Kell, Douglas B.
author_sort O’Hagan, Steve
collection PubMed
description In previous work, we have assessed the structural similarities between marketed drugs (‘drugs’) and endogenous natural human metabolites (‘metabolites’ or ‘endogenites’), using ‘fingerprint’ methods in common use, and the Tanimoto and Tversky similarity metrics, finding that the fingerprint encoding used had a dramatic effect on the apparent similarities observed. By contrast, the maximal common substructure (MCS), when the means of determining it is fixed, is a means of determining similarities that is largely independent of the fingerprints, and also has a clear chemical meaning. We here explored the utility of the MCS and metrics derived therefrom. In many cases, a shared scaffold helps cluster drugs and endogenites, and gives insight into enzymes (in particular transporters) that they both share. Tanimoto and Tversky similarities based on the MCS tend to be smaller than those based on the MACCS fingerprint-type encoding, though the converse is also true for a significant fraction of the comparisons. While no single molecular descriptor can account for these differences, a machine learning-based analysis of the nature of the differences (MACCS_Tanimoto vs MCS_Tversky) shows that they are indeed deterministic, although the features that are used in the model to account for this vary greatly with each individual drug. The extent of its utility and interpretability varies with the drug of interest, implying that while MCS is neither ‘better’ nor ‘worse’ for every drug–endogenite comparison, it is sufficiently different to be of value. The overall conclusion is thus that the use of the MCS provides an additional and valuable strategy for understanding the structural basis for similarities between synthetic, marketed drugs and natural intermediary metabolites. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-017-0198-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5344883
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5344883 2017-03-17 Analysis of drug–endogenous human metabolite similarities in terms of their maximum common substructures O’Hagan, Steve Kell, Douglas B. J Cheminform Research Article Springer International Publishing 2017-03-09 /pmc/articles/PMC5344883/ /pubmed/28316656 http://dx.doi.org/10.1186/s13321-017-0198-y Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
O’Hagan, Steve
Kell, Douglas B.
Analysis of drug–endogenous human metabolite similarities in terms of their maximum common substructures
title Analysis of drug–endogenous human metabolite similarities in terms of their maximum common substructures
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5344883/
https://www.ncbi.nlm.nih.gov/pubmed/28316656
http://dx.doi.org/10.1186/s13321-017-0198-y
work_keys_str_mv AT ohagansteve analysisofdrugendogenoushumanmetabolitesimilaritiesintermsoftheirmaximumcommonsubstructures
AT kelldouglasb analysisofdrugendogenoushumanmetabolitesimilaritiesintermsoftheirmaximumcommonsubstructures
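
The description above refers to Tanimoto and Tversky similarities derived from the MCS. As a rough illustration only, the sketch below uses the common set-based forms of those two metrics, applied to sizes (for example heavy-atom counts) of two molecules and of their MCS. This record does not state the exact definitions, the size measure, or the Tversky weights used in the article, so the function names, the `alpha`/`beta` defaults, and the example counts are all assumptions made for illustration.

```python
# Illustrative sketch: set-based Tanimoto and Tversky similarity computed
# from the size of a maximum common substructure (MCS).
# Assumption: sizes are non-negative counts (e.g. heavy atoms); the article's
# own definitions and weights are not given in this catalog record.


def _check_sizes(n_a, n_b, n_mcs):
    """Reject sizes that cannot describe two molecules and their MCS."""
    if min(n_a, n_b, n_mcs) < 0:
        raise ValueError("sizes must be non-negative")
    if n_mcs > min(n_a, n_b):
        raise ValueError("the MCS cannot be larger than either molecule")


def mcs_tversky(n_a, n_b, n_mcs, alpha=1.0, beta=1.0):
    """Tversky index: |MCS| / (alpha*|A only| + beta*|B only| + |MCS|)."""
    _check_sizes(n_a, n_b, n_mcs)
    denom = alpha * (n_a - n_mcs) + beta * (n_b - n_mcs) + n_mcs
    # With no shared atoms and zero weights the ratio is undefined; use 0.0.
    return n_mcs / denom if denom > 0 else 0.0


def mcs_tanimoto(n_a, n_b, n_mcs):
    """Tanimoto index: |MCS| / (|A| + |B| - |MCS|), i.e. Tversky with alpha = beta = 1."""
    return mcs_tversky(n_a, n_b, n_mcs, alpha=1.0, beta=1.0)


# Hypothetical example: a 20-atom drug, a 15-atom metabolite, a 10-atom MCS.
drug_size, metabolite_size, shared_size = 20, 15, 10
print(round(mcs_tanimoto(drug_size, metabolite_size, shared_size), 3))
print(round(mcs_tversky(drug_size, metabolite_size, shared_size, alpha=1.0, beta=0.0), 3))
print(round(mcs_tversky(drug_size, metabolite_size, shared_size, alpha=0.0, beta=1.0), 3))
```

With asymmetric weights the Tversky value depends on which molecule is treated as the reference: setting `beta=0.0` measures the fraction of the first molecule covered by the MCS, and setting `alpha=0.0` measures the fraction of the second.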