Consensus Diversity Plots: a global diversity analysis of chemical libraries

BACKGROUND: Measuring the structural diversity of compound databases is relevant in drug discovery and many other areas of chemistry. Since molecular diversity depends on molecular representation, comprehensive chemoinformatic analysis of the diversity of libraries uses multiple criteria. For instan...

Bibliographic Details
Main authors: González-Medina, Mariana; Prieto-Martínez, Fernando D.; Owen, John R.; Medina-Franco, José L.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5105260/
https://www.ncbi.nlm.nih.gov/pubmed/27895718
http://dx.doi.org/10.1186/s13321-016-0176-9
author González-Medina, Mariana
Prieto-Martínez, Fernando D.
Owen, John R.
Medina-Franco, José L.
collection PubMed
description BACKGROUND: Measuring the structural diversity of compound databases is relevant in drug discovery and many other areas of chemistry. Because molecular diversity depends on molecular representation, a comprehensive chemoinformatic analysis of library diversity uses multiple criteria. For instance, the diversity of molecular libraries is typically evaluated using molecular scaffolds, structural fingerprints, and physicochemical properties. However, each criterion is assessed independently, and it is not straightforward to evaluate the “global diversity”. RESULTS: Herein the Consensus Diversity Plot (CDP) is proposed as a novel method to represent, in low dimensions, the diversity of chemical libraries while simultaneously considering multiple molecular representations. We illustrate the application of CDPs to classify eight compound data sets and two subsets of different sizes and compositions using molecular scaffolds, structural fingerprints, and physicochemical properties. CONCLUSIONS: CDPs are general data mining tools that represent the global diversity of compound data sets in two dimensions using multiple metrics. These plots can be constructed using single or combined measures of diversity. An online version of the CDPs is freely available at: https://consensusdiversityplots-difacquim-unam.shinyapps.io/RscriptsCDPlots/. [Figure: see text] ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-016-0176-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5105260
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5105260 2016-11-28 Consensus Diversity Plots: a global diversity analysis of chemical libraries González-Medina, Mariana; Prieto-Martínez, Fernando D.; Owen, John R.; Medina-Franco, José L. J Cheminform Methodology
Springer International Publishing 2016-11-10 /pmc/articles/PMC5105260/ /pubmed/27895718 http://dx.doi.org/10.1186/s13321-016-0176-9 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Consensus Diversity Plots: a global diversity analysis of chemical libraries
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5105260/
https://www.ncbi.nlm.nih.gov/pubmed/27895718
http://dx.doi.org/10.1186/s13321-016-0176-9