Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network

As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are these ex...

Bibliographic Details
Main Authors:	Chen, Jiarui; Si, Yain-Whar; Un, Chon-Wai; Siu, Shirley W. I.
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2021
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8627024/ https://www.ncbi.nlm.nih.gov/pubmed/34838140 http://dx.doi.org/10.1186/s13321-021-00570-8

_version_	1784606773509881856
author	Chen, Jiarui; Si, Yain-Whar; Un, Chon-Wai; Siu, Shirley W. I.
author_facet	Chen, Jiarui; Si, Yain-Whar; Un, Chon-Wai; Siu, Shirley W. I.
author_sort	Chen, Jiarui
collection	PubMed
description	As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are these experiments time-consuming and costly, but experiments that involve animal testing are increasingly subject to ethical concerns. While traditional machine learning (ML) methods have been used in the field with some success, the limited availability of annotated toxicity data is the major hurdle for further improving model performance. Inspired by the success of semi-supervised learning (SSL) algorithms, we propose a Graph Convolution Neural Network (GCN) to predict chemical toxicity and train the network with the Mean Teacher (MT) SSL algorithm. Using the Tox21 data, our optimal SSL-GCN models for predicting the twelve toxicological endpoints achieve an average ROC-AUC score of 0.757 in the test set, which is a 6% improvement over GCN models trained by supervised learning and conventional ML methods. Our SSL-GCN models also exhibit superior performance when compared to models constructed using the built-in DeepChem ML methods. This study demonstrates that SSL can increase the prediction power of models by learning from unannotated data. The optimal unannotated-to-annotated data ratio ranges between 1:1 and 4:1. This study demonstrates the success of SSL in chemical toxicity prediction; the same technique is expected to be beneficial to other chemical property prediction tasks by utilizing existing large chemical databases. Our optimal model SSL-GCN is hosted on an online server accessible through: https://app.cbbio.online/ssl-gcn/home. SUPPLEMENTARY INFORMATION: Supplementary information accompanies this paper at 10.1186/s13321-021-00570-8.
format	Online Article Text
id	pubmed-8627024
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-8627024 2021-11-30 Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network. Chen, Jiarui; Si, Yain-Whar; Un, Chon-Wai; Siu, Shirley W. I. J Cheminform. Methodology. Springer International Publishing 2021-11-27 /pmc/articles/PMC8627024/ /pubmed/34838140 http://dx.doi.org/10.1186/s13321-021-00570-8 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8627024/ https://www.ncbi.nlm.nih.gov/pubmed/34838140 http://dx.doi.org/10.1186/s13321-021-00570-8