Network-based piecewise linear regression for QSAR modelling

Quantitative Structure-Activity Relationship (QSAR) models are critical in various areas of drug discovery, for example in lead optimisation and virtual screening. Recently, the need for models that are not only predictive but also interpretable has been highlighted. In this paper, a new methodology...

Bibliographic Details
Main authors: Cardoso-Silva, Jonathan; Papageorgiou, Lazaros G.; Tsoka, Sophia
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6825651/
https://www.ncbi.nlm.nih.gov/pubmed/31628660
http://dx.doi.org/10.1007/s10822-019-00228-6
_version_ 1783464925161586688
author Cardoso-Silva, Jonathan
Papageorgiou, Lazaros G.
Tsoka, Sophia
author_facet Cardoso-Silva, Jonathan
Papageorgiou, Lazaros G.
Tsoka, Sophia
author_sort Cardoso-Silva, Jonathan
collection PubMed
description Quantitative Structure-Activity Relationship (QSAR) models are critical in various areas of drug discovery, for example in lead optimisation and virtual screening. Recently, the need for models that are not only predictive but also interpretable has been highlighted. In this paper, a new methodology is proposed to build interpretable QSAR models by combining elements of network analysis and piecewise linear regression. The algorithm presented, modSAR, splits data using a two-step procedure. First, compounds associated with a common target are represented as a network in terms of their structural similarity, revealing modules of similar chemical properties. Second, each module is subdivided into subsets (regions), each of which is modelled by an independent linear equation. Comparative analysis of QSAR models across five data sets of protein inhibitors obtained from ChEMBL is reported and it is shown that modSAR offers similar predictive accuracy to popular algorithms, such as Random Forest and Support Vector Machine. Moreover, we show that models built by modSAR are interpretable, capable of evaluating the applicability domain of the compounds, and well suited to tasks such as virtual screening and the development of new drug leads. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10822-019-00228-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6825651
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6825651 2019-11-05 Network-based piecewise linear regression for QSAR modelling Cardoso-Silva, Jonathan Papageorgiou, Lazaros G. Tsoka, Sophia J Comput Aided Mol Des Article Quantitative Structure-Activity Relationship (QSAR) models are critical in various areas of drug discovery, for example in lead optimisation and virtual screening. Recently, the need for models that are not only predictive but also interpretable has been highlighted. In this paper, a new methodology is proposed to build interpretable QSAR models by combining elements of network analysis and piecewise linear regression. The algorithm presented, modSAR, splits data using a two-step procedure. First, compounds associated with a common target are represented as a network in terms of their structural similarity, revealing modules of similar chemical properties. Second, each module is subdivided into subsets (regions), each of which is modelled by an independent linear equation. Comparative analysis of QSAR models across five data sets of protein inhibitors obtained from ChEMBL is reported and it is shown that modSAR offers similar predictive accuracy to popular algorithms, such as Random Forest and Support Vector Machine. Moreover, we show that models built by modSAR are interpretable, capable of evaluating the applicability domain of the compounds, and well suited to tasks such as virtual screening and the development of new drug leads. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10822-019-00228-6) contains supplementary material, which is available to authorized users.
Springer International Publishing 2019-10-18 2019 /pmc/articles/PMC6825651/ /pubmed/31628660 http://dx.doi.org/10.1007/s10822-019-00228-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle Article
Cardoso-Silva, Jonathan
Papageorgiou, Lazaros G.
Tsoka, Sophia
Network-based piecewise linear regression for QSAR modelling
title Network-based piecewise linear regression for QSAR modelling
title_full Network-based piecewise linear regression for QSAR modelling
title_fullStr Network-based piecewise linear regression for QSAR modelling
title_full_unstemmed Network-based piecewise linear regression for QSAR modelling
title_short Network-based piecewise linear regression for QSAR modelling
title_sort network-based piecewise linear regression for qsar modelling
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6825651/
https://www.ncbi.nlm.nih.gov/pubmed/31628660
http://dx.doi.org/10.1007/s10822-019-00228-6
work_keys_str_mv AT cardososilvajonathan networkbasedpiecewiselinearregressionforqsarmodelling
AT papageorgioulazarosg networkbasedpiecewiselinearregressionforqsarmodelling
AT tsokasophia networkbasedpiecewiselinearregressionforqsarmodelling