A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification

Deep neural networks have been increasingly used in various chemical fields. In the nature of a data-driven approach, their performance strongly depends on data used in training. Therefore, models developed in data-deficient situations can cause highly uncertain predictions, leading to vulnerable de...

Bibliographic Details
Main authors: Ryu, Seongok; Kwon, Yongchan; Kim, Woo Youn
Format: Online Article Text
Language: English
Published: Royal Society of Chemistry 2019
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6839511/
https://www.ncbi.nlm.nih.gov/pubmed/31803423
http://dx.doi.org/10.1039/c9sc01992h
author Ryu, Seongok
Kwon, Yongchan
Kim, Woo Youn
collection PubMed
description Deep neural networks have been increasingly used in various chemical fields. In the nature of a data-driven approach, their performance strongly depends on data used in training. Therefore, models developed in data-deficient situations can cause highly uncertain predictions, leading to vulnerable decision making. Here, we show that Bayesian inference enables more reliable prediction with quantitative uncertainty analysis. Decomposition of the predictive uncertainty into model- and data-driven uncertainties allows us to elucidate the source of errors for further improvements. For molecular applications, we devised a Bayesian graph convolutional network (GCN) and evaluated its performance for molecular property predictions. Our study on the classification problem of bio-activity and toxicity shows that the confidence of prediction can be quantified in terms of the predictive uncertainty, leading to more accurate virtual screening of drug candidates than standard GCNs. The result of log P prediction illustrates that data noise affects the data-driven uncertainty more significantly than the model-driven one. Based on this finding, we could identify artefacts that arose from quantum mechanical calculations in the Harvard Clean Energy Project dataset. Consequently, the Bayesian GCN is critical for molecular applications under data-deficient conditions.
format Online
Article
Text
id pubmed-6839511
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-6839511 2019-12-04. A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Ryu, Seongok; Kwon, Yongchan; Kim, Woo Youn. Chem Sci (Chemistry). Royal Society of Chemistry, 2019-07-22. /pmc/articles/PMC6839511/ /pubmed/31803423 http://dx.doi.org/10.1039/c9sc01992h Text, en. This journal is © The Royal Society of Chemistry 2019. http://creativecommons.org/licenses/by-nc/3.0/ This article is freely available. This article is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported Licence (CC BY-NC 3.0).
title A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6839511/
https://www.ncbi.nlm.nih.gov/pubmed/31803423
http://dx.doi.org/10.1039/c9sc01992h