Badapple: promiscuity patterns from noisy evidence

Bibliographic Details
Main authors: Yang, Jeremy J., Ursu, Oleg, Lipinski, Christopher A., Sklar, Larry A., Oprea, Tudor I., Bologa, Cristian G.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4884375/
https://www.ncbi.nlm.nih.gov/pubmed/27239230
http://dx.doi.org/10.1186/s13321-016-0137-3
author Yang, Jeremy J.
Ursu, Oleg
Lipinski, Christopher A.
Sklar, Larry A.
Oprea, Tudor I.
Bologa, Cristian G.
collection PubMed
description BACKGROUND: Bioassay data analysis continues to be an essential, routine, yet challenging task in modern drug discovery and chemical biology research. The challenge is to infer reliable knowledge from big and noisy data. Some aspects of this problem are general, with solutions informed by existing and emerging data science best practices. Others are domain-specific and rely on expertise in bioassay methodology and chemical biology. Testing compounds for biological activity requires complex and innovative methodology, producing results that vary widely in accuracy, precision, and information content. Hit selection criteria are optimized so that the overall probability of success in a project is maximized and resource-wasteful “false trails” are avoided. This “fail-early” approach is embraced in both pharmaceutical and academic drug discovery, since follow-up capacity is resource-limited. Thus, early identification of likely promiscuous compounds has practical value.
RESULTS: Here we describe Badapple (bioassay-data associative promiscuity pattern learning engine), an algorithm that identifies likely promiscuous compounds via associated scaffolds and combines general and domain-specific features to assist and accelerate drug discovery informatics. Results are described from an analysis using data from MLP assays via the BioAssay Research Database (BARD, http://bard.nih.gov). Specific examples are analyzed in the context of medicinal chemistry to illustrate associations with mechanisms of promiscuity. Badapple was developed at UNM and has been released and deployed for public use in two ways: (1) as a BARD plugin, integrated into the public BARD REST API and BARD web client; and (2) as a public web app hosted at UNM.
CONCLUSIONS: Badapple is a method for rapidly identifying likely promiscuous compounds via associated scaffolds. Badapple generates a score associated with a pragmatic, empirical definition of promiscuity, with the overall goal of identifying “false trails” and streamlining workflows. Unlike methods reliant on expert curation of chemical substructure patterns, Badapple is fully evidence-driven, automated, self-improving via integration of additional data, and focused on scaffolds. Badapple is robust with respect to noise and errors, and skeptical of scanty evidence.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-016-0137-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4884375
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-4884375 2016-05-29 Badapple: promiscuity patterns from noisy evidence Yang, Jeremy J. Ursu, Oleg Lipinski, Christopher A. Sklar, Larry A. Oprea, Tudor I. Bologa, Cristian G. J Cheminform Research Article Springer International Publishing 2016-05-28 /pmc/articles/PMC4884375/ /pubmed/27239230 http://dx.doi.org/10.1186/s13321-016-0137-3 Text en © Yang et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Badapple: promiscuity patterns from noisy evidence
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4884375/
https://www.ncbi.nlm.nih.gov/pubmed/27239230
http://dx.doi.org/10.1186/s13321-016-0137-3