
Modular Software for Generating and Modeling Diverse Polymer Databases

Machine learning methods offer the opportunity to design new functional materials on an unprecedented scale; however, building the large, diverse databases of molecules on which to train such methods remains a daunting task. Automated computational chemistry modeling workflows are...

Bibliographic Details
Main authors: Santana-Bonilla, Alejandro; López-Ríos de Castro, Raquel; Sun, Peike; Ziolek, Robert M.; Lorenz, Christian D.
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10302471/
https://www.ncbi.nlm.nih.gov/pubmed/37288782
http://dx.doi.org/10.1021/acs.jcim.3c00081
author Santana-Bonilla, Alejandro
López-Ríos de Castro, Raquel
Sun, Peike
Ziolek, Robert M.
Lorenz, Christian D.
collection PubMed
description Machine learning methods offer the opportunity to design new functional materials on an unprecedented scale; however, building the large, diverse databases of molecules on which to train such methods remains a daunting task. Automated computational chemistry modeling workflows are therefore becoming essential tools in this data-driven hunt for new materials with novel properties, since they offer a means by which to create and curate molecular databases without requiring significant levels of user input. This ensures that well-founded concerns regarding data provenance, reproducibility, and replicability are mitigated. We have developed a versatile and flexible software package, PySoftK (Python Soft Matter at King’s College London), that provides flexible, automated computational workflows to create, model, and curate libraries of polymers with minimal user intervention. PySoftK is available as an efficient, fully tested, and easily installable Python package. Key features of the software include the wide range of different polymer topologies that can be automatically generated and its fully parallelized library generation tools. It is anticipated that PySoftK will support the generation, modeling, and curation of large polymer libraries to support functional materials discovery in the nanotechnology and biotechnology arenas.
format Online
Article
Text
id pubmed-10302471
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10302471 2023-06-29 Modular Software for Generating and Modeling Diverse Polymer Databases Santana-Bonilla, Alejandro López-Ríos de Castro, Raquel Sun, Peike Ziolek, Robert M. Lorenz, Christian D. J Chem Inf Model American Chemical Society 2023-06-08 /pmc/articles/PMC10302471/ /pubmed/37288782 http://dx.doi.org/10.1021/acs.jcim.3c00081 Text en © 2023 The Authors. Published by American Chemical Society.
License: https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained.
title Modular Software for Generating and Modeling Diverse Polymer Databases