
Machine Learning Enables Accurate and Rapid Prediction of Active Molecules Against Breast Cancer Cells

Breast cancer (BC) has surpassed lung cancer as the most frequently occurring cancer, and it is the leading cause of cancer-related death in women. Therefore, there is an urgent need to discover or design new drug candidates for BC treatment. In this study, we first collected a series of structurall...


Bibliographic Details
Main Authors: He, Shuyun, Zhao, Duancheng, Ling, Yanle, Cai, Hanxuan, Cai, Yike, Zhang, Jiquan, Wang, Ling
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Pharmacology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8719637/
https://www.ncbi.nlm.nih.gov/pubmed/34975493
http://dx.doi.org/10.3389/fphar.2021.796534
author He, Shuyun
Zhao, Duancheng
Ling, Yanle
Cai, Hanxuan
Cai, Yike
Zhang, Jiquan
Wang, Ling
author_facet He, Shuyun
Zhao, Duancheng
Ling, Yanle
Cai, Hanxuan
Cai, Yike
Zhang, Jiquan
Wang, Ling
author_sort He, Shuyun
collection PubMed
description Breast cancer (BC) has surpassed lung cancer as the most frequently occurring cancer, and it is the leading cause of cancer-related death in women. Therefore, there is an urgent need to discover or design new drug candidates for BC treatment. In this study, we first collected a series of structurally diverse datasets consisting of 33,757 active and 21,152 inactive compounds for 13 breast cancer cell lines and one normal breast cell line commonly used in in vitro antiproliferative assays. Predictive models were then developed using five conventional machine learning algorithms, including naïve Bayesian, support vector machine, k-Nearest Neighbors, random forest, and extreme gradient boosting, as well as five deep learning algorithms, including deep neural networks, graph convolutional networks, graph attention network, message passing neural networks, and Attentive FP. A total of 476 single models and 112 fusion models were constructed based on three types of molecular representations including molecular descriptors, fingerprints, and graphs. The evaluation results demonstrate that the best model for each BC cell subtype can achieve high predictive accuracy for the test sets with AUC values of 0.689–0.993. Moreover, important structural fragments related to BC cell inhibition were identified and interpreted. To facilitate the use of the model, an online webserver called ChemBC (http://chembc.idruglab.cn/) and its local version software (https://github.com/idruglab/ChemBC) were developed to predict whether compounds have potential inhibitory activity against BC cells.
format Online
Article
Text
id pubmed-8719637
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8719637 2022-01-01 Machine Learning Enables Accurate and Rapid Prediction of Active Molecules Against Breast Cancer Cells He, Shuyun Zhao, Duancheng Ling, Yanle Cai, Hanxuan Cai, Yike Zhang, Jiquan Wang, Ling Front Pharmacol Pharmacology Breast cancer (BC) has surpassed lung cancer as the most frequently occurring cancer, and it is the leading cause of cancer-related death in women. Therefore, there is an urgent need to discover or design new drug candidates for BC treatment. In this study, we first collected a series of structurally diverse datasets consisting of 33,757 active and 21,152 inactive compounds for 13 breast cancer cell lines and one normal breast cell line commonly used in in vitro antiproliferative assays. Predictive models were then developed using five conventional machine learning algorithms, including naïve Bayesian, support vector machine, k-Nearest Neighbors, random forest, and extreme gradient boosting, as well as five deep learning algorithms, including deep neural networks, graph convolutional networks, graph attention network, message passing neural networks, and Attentive FP. A total of 476 single models and 112 fusion models were constructed based on three types of molecular representations including molecular descriptors, fingerprints, and graphs. The evaluation results demonstrate that the best model for each BC cell subtype can achieve high predictive accuracy for the test sets with AUC values of 0.689–0.993. Moreover, important structural fragments related to BC cell inhibition were identified and interpreted. To facilitate the use of the model, an online webserver called ChemBC (http://chembc.idruglab.cn/) and its local version software (https://github.com/idruglab/ChemBC) were developed to predict whether compounds have potential inhibitory activity against BC cells. Frontiers Media S.A.
2021-12-17 /pmc/articles/PMC8719637/ /pubmed/34975493 http://dx.doi.org/10.3389/fphar.2021.796534 Text en Copyright © 2021 He, Zhao, Ling, Cai, Cai, Zhang and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Pharmacology
He, Shuyun
Zhao, Duancheng
Ling, Yanle
Cai, Hanxuan
Cai, Yike
Zhang, Jiquan
Wang, Ling
Machine Learning Enables Accurate and Rapid Prediction of Active Molecules Against Breast Cancer Cells
title Machine Learning Enables Accurate and Rapid Prediction of Active Molecules Against Breast Cancer Cells
title_sort machine learning enables accurate and rapid prediction of active molecules against breast cancer cells
topic Pharmacology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8719637/
https://www.ncbi.nlm.nih.gov/pubmed/34975493
http://dx.doi.org/10.3389/fphar.2021.796534