A graph neural network approach for molecule carcinogenicity prediction

Bibliographic Details
Main authors: Fradkin, Philip; Young, Adamo; Atanackovic, Lazar; Frey, Brendan; Lee, Leo J; Wang, Bo
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235510/
https://www.ncbi.nlm.nih.gov/pubmed/35758812
http://dx.doi.org/10.1093/bioinformatics/btac266
_version_ 1784736327581827072
author Fradkin, Philip
Young, Adamo
Atanackovic, Lazar
Frey, Brendan
Lee, Leo J
Wang, Bo
collection PubMed
description MOTIVATION: Molecular carcinogenicity is a preventable cause of cancer, but systematically identifying carcinogenic compounds, which involves performing experiments on animal models, is expensive, time-consuming and low-throughput. As a result, carcinogenicity information is limited, and building data-driven models with good prediction accuracy remains a major challenge. RESULTS: In this work, we propose CONCERTO, a deep learning model that uses a graph transformer in conjunction with a molecular fingerprint representation for carcinogenicity prediction from molecular structure. Special efforts have been made to overcome the data size constraint, such as multi-round pre-training on related but lower-quality mutagenicity data, and transfer learning from a large self-supervised model. Extensive experiments demonstrate that our model performs well and can generalize to external validation sets. CONCERTO could be useful for guiding future carcinogenicity experiments and could provide insight into the molecular basis of carcinogenicity. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are available on GitHub at https://github.com/bowang-lab/CONCERTO
format Online
Article
Text
id pubmed-9235510
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9235510 2022-06-29 A graph neural network approach for molecule carcinogenicity prediction
Fradkin, Philip; Young, Adamo; Atanackovic, Lazar; Frey, Brendan; Lee, Leo J; Wang, Bo
Bioinformatics
ISCB/ISMB 2022
Oxford University Press 2022-06-27
/pmc/articles/PMC9235510/
/pubmed/35758812
http://dx.doi.org/10.1093/bioinformatics/btac266
Text en
© The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A graph neural network approach for molecule carcinogenicity prediction
topic ISCB/ISMB 2022
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235510/
https://www.ncbi.nlm.nih.gov/pubmed/35758812
http://dx.doi.org/10.1093/bioinformatics/btac266