
Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language

Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is critical for integrating ML into research workflows, yet many data models impose significant rigidity, making it difficult to accommodate...

Bibliographic Details
Main authors: Park, Nathaniel H., Manica, Matteo, Born, Jannis, Hedrick, James L., Erdmann, Tim, Zubarev, Dmitry Yu., Adell-Mill, Nil, Arrechea, Pedro L.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10284867/
https://www.ncbi.nlm.nih.gov/pubmed/37344485
http://dx.doi.org/10.1038/s41467-023-39396-3
_version_ 1785061486380449792
author Park, Nathaniel H.
Manica, Matteo
Born, Jannis
Hedrick, James L.
Erdmann, Tim
Zubarev, Dmitry Yu.
Adell-Mill, Nil
Arrechea, Pedro L.
author_sort Park, Nathaniel H.
collection PubMed
description Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is critical for integrating ML into research workflows, yet many data models impose significant rigidity, making it difficult to accommodate the broad array of experiment and data types found in polymer science. This inflexibility presents a significant barrier to researchers seeking to leverage their historical data in ML development. Here we show that a domain-specific language, termed Chemical Markdown Language (CMDL), provides flexible, extensible, and consistent representation of disparate experiment types and polymer structures. CMDL enables seamless use of historical experimental data to fine-tune regression transformer (RT) models for generative molecular design tasks. We demonstrate the utility of this approach through the generation and experimental validation of catalysts and polymers in the context of ring-opening polymerization, and we provide examples of how CMDL can be applied more broadly to other polymer classes. Critically, we show that the CMDL-tuned model preserves key functional groups within the polymer structure, allowing for experimental validation. These results reveal the versatility of CMDL and how it facilitates translation of historical data into meaningful predictive and generative models that produce experimentally actionable output.
format Online
Article
Text
id pubmed-10284867
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10284867 2023-06-23 Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language Park, Nathaniel H. Manica, Matteo Born, Jannis Hedrick, James L. Erdmann, Tim Zubarev, Dmitry Yu. Adell-Mill, Nil Arrechea, Pedro L. Nat Commun Article
Nature Publishing Group UK 2023-06-21 /pmc/articles/PMC10284867/ /pubmed/37344485 http://dx.doi.org/10.1038/s41467-023-39396-3 Text en © The Author(s) 2023, corrected publication 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10284867/
https://www.ncbi.nlm.nih.gov/pubmed/37344485
http://dx.doi.org/10.1038/s41467-023-39396-3