
Maximizing gain in high-throughput screening using conformal prediction

Iterative screening has emerged as a promising approach to increase the efficiency of screening campaigns compared to traditional high-throughput approaches. By learning from a subset of the compound library, predictive models can infer which compounds to screen next, resulting in...


Bibliographic Details
Main authors: Svensson, Fredrik; Afzal, Avid M.; Norinder, Ulf; Bender, Andreas
Format: Online Article Text
Language: English
Published: Springer International Publishing 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5821614/
https://www.ncbi.nlm.nih.gov/pubmed/29468427
http://dx.doi.org/10.1186/s13321-018-0260-4
_version_ 1783301530713063424
author Svensson, Fredrik
Afzal, Avid M.
Norinder, Ulf
Bender, Andreas
author_facet Svensson, Fredrik
Afzal, Avid M.
Norinder, Ulf
Bender, Andreas
author_sort Svensson, Fredrik
collection PubMed
description Iterative screening has emerged as a promising approach to increase the efficiency of screening campaigns compared to traditional high-throughput approaches. By learning from a subset of the compound library, predictive models can infer which compounds to screen next, resulting in more efficient screening. One way to evaluate screening is to consider the cost of screening compared to the gain associated with finding an active compound. In this work, we introduce a conformal predictor coupled with a gain-cost function with the aim of maximizing gain in iterative screening. Using this setup, we were able to show that evaluating the predictions on the training data yields very accurate predictions of which settings will produce the highest gain on the test data. We evaluate the approach on 12 bioactivity datasets from PubChem, training the models using 20% of the data. Depending on the settings of the gain-cost function, the settings generating the maximum gain were accurately identified in 8–10 out of the 12 datasets. Broadly, our approach can predict which strategy generates the highest gain based on the results of the gain-cost evaluation: to screen the compounds predicted to be active, to screen all the remaining data, or not to screen any additional compounds. When the algorithm indicates that the predicted active compounds should be screened, our approach also indicates which confidence level to apply in order to maximize gain. Hence, our approach facilitates decision-making and allocation of resources where they deliver the most value by indicating in advance the likely outcome of a screening campaign. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13321-018-0260-4) contains supplementary material, which is available to authorized users.
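A minimal sketch of the decision outlined in the description above, not the authors' implementation. It assumes that each compound has conformal p-values for the active and inactive classes, that a compound counts as predicted active when only the active label survives at the chosen significance level (confidence = 1 - significance), and that gain equals the number of hits times a per-hit gain minus the number of screened compounds times a per-compound cost. The function names, the form of the gain-cost function, and all parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def gain(n_screened: int, n_hits: int, hit_gain: float, cost: float) -> float:
    """Assumed gain-cost function: reward for hits minus cost of screening."""
    return n_hits * hit_gain - n_screened * cost


def predicted_active(p_active: float, p_inactive: float, significance: float) -> bool:
    """True when the prediction set contains only the 'active' label."""
    return p_active > significance and p_inactive <= significance


def best_strategy(
    p_values: Sequence[Tuple[float, float]],  # (p_active, p_inactive) per compound
    labels: Sequence[int],                    # 1 = active, 0 = inactive (known outcomes)
    hit_gain: float,
    cost: float,
    significance_levels: Sequence[float],
) -> Tuple[str, Optional[float], float]:
    """Return (strategy, significance level or None, gain) with the highest gain."""
    n = len(labels)
    options: List[Tuple[str, Optional[float], float]] = [
        ("screen none", None, 0.0),
        ("screen all", None, gain(n, sum(labels), hit_gain, cost)),
    ]
    for eps in significance_levels:
        # Outcomes of the compounds that would be screened at this level.
        picked = [
            y for (pa, pi), y in zip(p_values, labels)
            if predicted_active(pa, pi, eps)
        ]
        options.append(
            ("screen predicted actives", eps,
             gain(len(picked), sum(picked), hit_gain, cost))
        )
    # max() keeps the first option on ties, so "screen none" wins a tie at zero gain.
    return max(options, key=lambda option: option[2])
```

Run on data with known outcomes (the description uses the training data for this), the function returns the strategy to apply to the remaining compounds. Raising the per-compound cost relative to the per-hit gain moves the result from "screen all" towards "screen predicted actives" and eventually to "screen none".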
format Online
Article
Text
id pubmed-5821614
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5821614 2018-02-27 Maximizing gain in high-throughput screening using conformal prediction Svensson, Fredrik Afzal, Avid M. Norinder, Ulf Bender, Andreas J Cheminform Research Article Springer International Publishing 2018-02-21 /pmc/articles/PMC5821614/ /pubmed/29468427 http://dx.doi.org/10.1186/s13321-018-0260-4 Text en © The Author(s) 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Svensson, Fredrik
Afzal, Avid M.
Norinder, Ulf
Bender, Andreas
Maximizing gain in high-throughput screening using conformal prediction
title Maximizing gain in high-throughput screening using conformal prediction
title_full Maximizing gain in high-throughput screening using conformal prediction
title_fullStr Maximizing gain in high-throughput screening using conformal prediction
title_full_unstemmed Maximizing gain in high-throughput screening using conformal prediction
title_short Maximizing gain in high-throughput screening using conformal prediction
title_sort maximizing gain in high-throughput screening using conformal prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5821614/
https://www.ncbi.nlm.nih.gov/pubmed/29468427
http://dx.doi.org/10.1186/s13321-018-0260-4
work_keys_str_mv AT svenssonfredrik maximizinggaininhighthroughputscreeningusingconformalprediction
AT afzalavidm maximizinggaininhighthroughputscreeningusingconformalprediction
AT norinderulf maximizinggaininhighthroughputscreeningusingconformalprediction
AT benderandreas maximizinggaininhighthroughputscreeningusingconformalprediction