Persistent Dirac for molecular representation
Molecular representations are of fundamental importance for modeling and analyzing molecular systems. Molecular representation models have contributed greatly to successes in drug design and materials discovery. In this paper, we present a computational framework for molecular represent...
Main authors: | Wee, Junjie; Bianconi, Ginestra; Xia, Kelin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336089/ https://www.ncbi.nlm.nih.gov/pubmed/37433870 http://dx.doi.org/10.1038/s41598-023-37853-z |
_version_ | 1785071133739974656 |
---|---|
author | Wee, Junjie; Bianconi, Ginestra; Xia, Kelin |
author_facet | Wee, Junjie; Bianconi, Ginestra; Xia, Kelin |
author_sort | Wee, Junjie |
collection | PubMed |
description | Molecular representations are of fundamental importance for modeling and analyzing molecular systems. Molecular representation models have contributed greatly to successes in drug design and materials discovery. In this paper, we present a computational framework for molecular representation that is mathematically rigorous and based on the persistent Dirac operator. The properties of the discrete weighted and unweighted Dirac matrices are systematically discussed, and the biological meanings of both homological and non-homological eigenvectors are studied. We also evaluate the impact of various weighting schemes on the weighted Dirac matrix. Additionally, a set of physical persistent attributes, which characterize the persistence and variation of the spectral properties of Dirac matrices during a filtration process, is proposed as molecular fingerprints. Our persistent attributes are used to classify molecular configurations of nine different types of organic-inorganic halide perovskites. The combination of persistent attributes with a gradient boosting tree model has achieved great success in molecular solvation free energy prediction. The results show that our model is effective in characterizing molecular structures, demonstrating the power of our molecular representation and featurization approach. |
format | Online Article Text |
id | pubmed-10336089 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10336089 2023-07-13 Persistent Dirac for molecular representation. Wee, Junjie; Bianconi, Ginestra; Xia, Kelin. Sci Rep, Article. Nature Publishing Group UK, 2023-07-11. /pmc/articles/PMC10336089/ /pubmed/37433870 http://dx.doi.org/10.1038/s41598-023-37853-z Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Wee, Junjie Bianconi, Ginestra Xia, Kelin Persistent Dirac for molecular representation |
title | Persistent Dirac for molecular representation |
title_full | Persistent Dirac for molecular representation |
title_fullStr | Persistent Dirac for molecular representation |
title_full_unstemmed | Persistent Dirac for molecular representation |
title_short | Persistent Dirac for molecular representation |
title_sort | persistent dirac for molecular representation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336089/ https://www.ncbi.nlm.nih.gov/pubmed/37433870 http://dx.doi.org/10.1038/s41598-023-37853-z |
work_keys_str_mv | AT weejunjie persistentdiracformolecularrepresentation AT bianconiginestra persistentdiracformolecularrepresentation AT xiakelin persistentdiracformolecularrepresentation |
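
The description in this record refers to a discrete Dirac matrix whose spectrum is tracked through a filtration. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an unweighted complex restricted to vertices and edges, where the Dirac matrix takes the block form D = [[0, B1], [B1^T, 0]] with B1 the vertex-edge boundary matrix, so that D squared is block-diagonal with the Hodge Laplacians L0 = B1 B1^T and L1 = B1^T B1. The function names, the distance-cutoff filtration, and the two reported attributes (number of zero eigenvalues, smallest positive eigenvalue) are illustrative choices and do not come from the record.

```python
import numpy as np
from itertools import combinations


def boundary_matrix(n_vertices, edges):
    """Vertex-by-edge boundary matrix B1 for edges oriented i -> j."""
    b1 = np.zeros((n_vertices, len(edges)))
    for k, (i, j) in enumerate(edges):
        b1[i, k] = -1.0  # tail of the edge
        b1[j, k] = 1.0   # head of the edge
    return b1


def dirac_matrix(n_vertices, edges):
    """Unweighted Dirac matrix [[0, B1], [B1^T, 0]] of a graph (assumed form)."""
    b1 = boundary_matrix(n_vertices, edges)
    n_edges = len(edges)
    top = np.hstack([np.zeros((n_vertices, n_vertices)), b1])
    bottom = np.hstack([b1.T, np.zeros((n_edges, n_edges))])
    return np.vstack([top, bottom])


def spectral_attributes(points, cutoffs, tol=1e-8):
    """For each cutoff, connect points within that distance and summarize
    the Dirac spectrum as (cutoff, zero-eigenvalue count, min positive)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    rows = []
    for r in cutoffs:
        edges = [(i, j) for i, j in combinations(range(n), 2)
                 if np.linalg.norm(points[i] - points[j]) <= r]
        eig = np.linalg.eigvalsh(dirac_matrix(n, edges))
        positive = eig[eig > tol]
        n_zero = int(np.sum(np.abs(eig) <= tol))
        min_pos = float(positive.min()) if positive.size else 0.0
        rows.append((r, n_zero, min_pos))
    return rows


if __name__ == "__main__":
    # Hypothetical toy input: four points on a unit square.
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    for row in spectral_attributes(square, cutoffs=[0.5, 1.0, 1.5]):
        print(row)
```

In this restricted setting the number of zero eigenvalues equals the number of connected components plus the number of independent cycles, which is the homological part of the spectrum; the nonzero eigenvalues come in plus/minus pairs and carry the non-homological information. A per-cutoff table of such summaries could then be fed to any standard regressor or classifier.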