Perturbation-Theory Machine Learning (PTML) Multilabel Model of the ChEMBL Dataset of Preclinical Assays for Antisarcoma Compounds

Sarcomas are a group of malignant neoplasms of connective tissue with a different etiology than carcinomas. The efforts to discover new drugs with antisarcoma activity have generated large datasets of multiple preclinical assays with different experimental conditions. For instance,...

Bibliographic Details
Main authors: Cabrera-Andrade, Alejandro; López-Cortés, Andrés; Munteanu, Cristian R.; Pazos, Alejandro; Pérez-Castillo, Yunierkis; Tejera, Eduardo; Arrasate, Sonia; González-Díaz, Humbert
Format: Online Article Text
Language: English
Published: American Chemical Society 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594149/
https://www.ncbi.nlm.nih.gov/pubmed/33134682
http://dx.doi.org/10.1021/acsomega.0c03356
author Cabrera-Andrade, Alejandro
López-Cortés, Andrés
Munteanu, Cristian R.
Pazos, Alejandro
Pérez-Castillo, Yunierkis
Tejera, Eduardo
Arrasate, Sonia
González-Díaz, Humbert
collection PubMed
description Sarcomas are a group of malignant neoplasms of connective tissue whose etiology differs from that of carcinomas. Efforts to discover new drugs with antisarcoma activity have generated large datasets from multiple preclinical assays run under different experimental conditions. For instance, the ChEMBL database contains outcomes of 37,919 different antisarcoma assays with 34,955 different chemical compounds. Furthermore, the experimental conditions reported in this dataset include 157 types of biological activity parameters, 36 drug targets, 43 cell lines, and 17 assay organisms. Considering this information, we propose combining perturbation theory (PT) principles with machine learning (ML) to develop a PTML model to predict antisarcoma compounds. PTML models use a reference function that measures the probability of a drug being active under given conditions (protein, cell line, organism, etc.). In this paper, we used linear discriminant analysis and a neural network to train and compare PT and non-PT models. All the explored models have accuracies of 89.19–95.25% on the training sets and 89.22–95.46% on the validation sets. PTML-based strategies achieve similar accuracy but yield simpler models. Therefore, they may become a versatile tool for predicting antisarcoma compounds.
format Online
Article
Text
id pubmed-7594149
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-7594149 2020-10-30 ACS Omega
American Chemical Society 2020-10-15 /pmc/articles/PMC7594149/ /pubmed/33134682 http://dx.doi.org/10.1021/acsomega.0c03356 Text en © 2020 American Chemical Society. This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes.
title Perturbation-Theory Machine Learning (PTML) Multilabel Model of the ChEMBL Dataset of Preclinical Assays for Antisarcoma Compounds
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594149/
https://www.ncbi.nlm.nih.gov/pubmed/33134682
http://dx.doi.org/10.1021/acsomega.0c03356
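
The description above says that PTML models rely on a reference function measuring the probability of a drug being active under given conditions. The following is a minimal Python sketch of how inputs of that kind might be built. It rests on assumptions not stated in the record: the reference function is estimated as the fraction of active outcomes per condition, and the perturbation term is the deviation of a single molecular descriptor from its per-condition mean. The data and the names `records` and `ptml_features` are hypothetical, and the classifier step (linear discriminant analysis or a neural network, per the description) is omitted.

```python
from collections import defaultdict
from statistics import mean

# Toy, made-up records: (descriptor value, assay condition label, active flag).
# None of these values come from the ChEMBL dataset named in the record.
records = [
    (2.0, "cell_line_A", 1),
    (4.0, "cell_line_A", 0),
    (6.0, "cell_line_A", 1),
    (1.0, "organism_B", 0),
    (3.0, "organism_B", 0),
]


def ptml_features(rows):
    """Return (reference value, descriptor deviation, label) for each row.

    Assumption: the reference value is the fraction of active outcomes
    observed under the row's condition, and the deviation is the descriptor
    minus the mean descriptor of all rows sharing that condition.
    """
    by_condition = defaultdict(list)
    for descriptor, condition, active in rows:
        by_condition[condition].append((descriptor, active))

    # Per-condition statistics: mean descriptor and fraction of actives.
    stats = {
        cond: (
            mean([d for d, _ in items]),
            mean([a for _, a in items]),
        )
        for cond, items in by_condition.items()
    }

    features = []
    for descriptor, condition, active in rows:
        cond_mean, reference = stats[condition]
        features.append((reference, descriptor - cond_mean, active))
    return features


features = ptml_features(records)
for row in features:
    print(row)
```

Each output tuple pairs a condition-level prior with a compound-level deviation, which is the kind of input a PT-style classifier could be trained on. A real model would use many descriptors and several condition types at once.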