
A graph representation of molecular ensembles for polymer property prediction

Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have fav...


Bibliographic Details
Main authors: Aldeghi, Matteo; Coley, Connor W.
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry, 2022
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9473492/
https://www.ncbi.nlm.nih.gov/pubmed/36277616
http://dx.doi.org/10.1039/d2sc02839e
author Aldeghi, Matteo
Coley, Connor W.
collection PubMed
description Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.
format Online
Article
Text
id pubmed-9473492
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-9473492 2022-10-20. Journal: Chem Sci. Subject: Chemistry. Publisher: The Royal Society of Chemistry, 2022-08-25. Text, en. This journal is © The Royal Society of Chemistry. License: https://creativecommons.org/licenses/by-nc/3.0/
title A graph representation of molecular ensembles for polymer property prediction
topic Chemistry