PolyGrammar: Grammar for Digital Polymer Representation and Generation

Polymers are widely studied materials with diverse properties and applications determined by molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches cannot offer comprehensive design models fo...

Bibliographic Details
Main authors: Guo, Minghao, Shou, Wan, Makatura, Liane, Erps, Timothy, Foshey, Michael, Matusik, Wojciech
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9376847/
https://www.ncbi.nlm.nih.gov/pubmed/35678650
http://dx.doi.org/10.1002/advs.202101864
_version_ 1784768220788424704
author Guo, Minghao
Shou, Wan
Makatura, Liane
Erps, Timothy
Foshey, Michael
Matusik, Wojciech
author_facet Guo, Minghao
Shou, Wan
Makatura, Liane
Erps, Timothy
Foshey, Michael
Matusik, Wojciech
author_sort Guo, Minghao
collection PubMed
description Polymers are widely studied materials with diverse properties and applications determined by molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches cannot offer comprehensive design models for polymers because of their inherent scale and structural complexity. Here, a parametric, context‐sensitive grammar designed specifically for polymers (PolyGrammar) is proposed. Using the symbolic hypergraph representation and 14 simple production rules, PolyGrammar can represent and generate all valid polyurethane structures. An algorithm is presented to translate any polyurethane structure from the popular Simplified Molecular‐Input Line‐entry System (SMILES) string format into the PolyGrammar representation. The representative power of PolyGrammar is tested by translating a dataset of over 600 polyurethane samples collected from the literature. Furthermore, it is shown that PolyGrammar can be easily extended to other copolymers and homopolymers. By offering a complete, explicit representation scheme and an explainable generative model with validity guarantees, PolyGrammar takes an essential step toward a more comprehensive and practical system for polymer discovery and exploration. As the first bridge between formal languages and chemistry, PolyGrammar also serves as a critical blueprint to inform the design of similar grammars for other chemistries, including organic and inorganic molecules.
format Online
Article
Text
id pubmed-9376847
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-93768472022-08-18 Polygrammar: Grammar for Digital Polymer Representation and Generation Guo, Minghao Shou, Wan Makatura, Liane Erps, Timothy Foshey, Michael Matusik, Wojciech Adv Sci (Weinh) Research Articles Polymers are widely studied materials with diverse properties and applications determined by molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches cannot offer comprehensive design models for polymers because of their inherent scale and structural complexity. Here, a parametric, context‐sensitive grammar designed specifically for polymers (PolyGrammar) is proposed. Using the symbolic hypergraph representation and 14 simple production rules, PolyGrammar can represent and generate all valid polyurethane structures. An algorithm is presented to translate any polyurethane structure from the popular Simplified Molecular‐Input Line‐entry System (SMILES) string format into the PolyGrammar representation. The representative power of PolyGrammar is tested by translating a dataset of over 600 polyurethane samples collected from the literature. Furthermore, it is shown that PolyGrammar can be easily extended to other copolymers and homopolymers. By offering a complete, explicit representation scheme and an explainable generative model with validity guarantees, PolyGrammar takes an essential step toward a more comprehensive and practical system for polymer discovery and exploration. As the first bridge between formal languages and chemistry, PolyGrammar also serves as a critical blueprint to inform the design of similar grammars for other chemistries, including organic and inorganic molecules. John Wiley and Sons Inc. 2022-06-09 /pmc/articles/PMC9376847/ /pubmed/35678650 http://dx.doi.org/10.1002/advs.202101864 Text en © 2022 The Authors. 
Advanced Science published by Wiley‐VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Articles
Guo, Minghao
Shou, Wan
Makatura, Liane
Erps, Timothy
Foshey, Michael
Matusik, Wojciech
Polygrammar: Grammar for Digital Polymer Representation and Generation
title Polygrammar: Grammar for Digital Polymer Representation and Generation
title_full Polygrammar: Grammar for Digital Polymer Representation and Generation
title_fullStr Polygrammar: Grammar for Digital Polymer Representation and Generation
title_full_unstemmed Polygrammar: Grammar for Digital Polymer Representation and Generation
title_short Polygrammar: Grammar for Digital Polymer Representation and Generation
title_sort polygrammar: grammar for digital polymer representation and generation
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9376847/
https://www.ncbi.nlm.nih.gov/pubmed/35678650
http://dx.doi.org/10.1002/advs.202101864