Root-aligned SMILES: a tight representation for chemical reaction prediction
Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule rep...
Main authors: | Zhong, Zipeng; Song, Jie; Feng, Zunlei; Liu, Tiantao; Jia, Lingxiang; Yao, Shaolun; Wu, Min; Hou, Tingjun; Song, Mingli |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry 2022 |
Subjects: | Chemistry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365080/ https://www.ncbi.nlm.nih.gov/pubmed/36091202 http://dx.doi.org/10.1039/d2sc02763a |
_version_ | 1784765269637332992 |
---|---|
author | Zhong, Zipeng; Song, Jie; Feng, Zunlei; Liu, Tiantao; Jia, Lingxiang; Yao, Shaolun; Wu, Min; Hou, Tingjun; Song, Mingli |
author_facet | Zhong, Zipeng; Song, Jie; Feng, Zunlei; Liu, Tiantao; Jia, Lingxiang; Yao, Shaolun; Wu, Min; Hou, Tingjun; Song, Mingli |
author_sort | Zhong, Zipeng |
collection | PubMed |
description | Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the complex syntax and dedicated to learning the chemical knowledge for reactions. We compare the proposed R-SMILES with various state-of-the-art baselines and show that it significantly outperforms them all, demonstrating the superiority of the proposed method. |
format | Online Article Text |
id | pubmed-9365080 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-93650802022-09-08 Root-aligned SMILES: a tight representation for chemical reaction prediction Zhong, Zipeng Song, Jie Feng, Zunlei Liu, Tiantao Jia, Lingxiang Yao, Shaolun Wu, Min Hou, Tingjun Song, Mingli Chem Sci Chemistry Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the complex syntax and dedicated to learning the chemical knowledge for reactions. We compare the proposed R-SMILES with various state-of-the-art baselines and show that it significantly outperforms them all, demonstrating the superiority of the proposed method. The Royal Society of Chemistry 2022-07-12 /pmc/articles/PMC9365080/ /pubmed/36091202 http://dx.doi.org/10.1039/d2sc02763a Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/ |
spellingShingle | Chemistry Zhong, Zipeng Song, Jie Feng, Zunlei Liu, Tiantao Jia, Lingxiang Yao, Shaolun Wu, Min Hou, Tingjun Song, Mingli Root-aligned SMILES: a tight representation for chemical reaction prediction |
title | Root-aligned SMILES: a tight representation for chemical reaction prediction |
title_full | Root-aligned SMILES: a tight representation for chemical reaction prediction |
title_fullStr | Root-aligned SMILES: a tight representation for chemical reaction prediction |
title_full_unstemmed | Root-aligned SMILES: a tight representation for chemical reaction prediction |
title_short | Root-aligned SMILES: a tight representation for chemical reaction prediction |
title_sort | root-aligned smiles: a tight representation for chemical reaction prediction |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365080/ https://www.ncbi.nlm.nih.gov/pubmed/36091202 http://dx.doi.org/10.1039/d2sc02763a |
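
The abstract attributes part of the benefit of R-SMILES to a reduced edit distance between the product and reactant strings. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes a character-level Levenshtein distance (a real tokenizer would treat `Cl` as one token), and the three SMILES strings are hand-written for this illustration: acetanilide as the product, and acetyl chloride plus aniline written once starting from the same atom as the product and once in an unrelated atom order. The names `edit_distance`, `reactants_aligned`, and `reactants_unaligned` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hand-written illustration strings (assumptions, not data from the article).
product = "CC(=O)Nc1ccccc1"                   # acetanilide
reactants_aligned = "CC(=O)Cl.Nc1ccccc1"      # same starting atom and order as the product
reactants_unaligned = "c1ccc(N)cc1.ClC(C)=O"  # same molecules, unrelated atom order

d_aligned = edit_distance(product, reactants_aligned)
d_unaligned = edit_distance(product, reactants_unaligned)
print("aligned:", d_aligned, "unaligned:", d_unaligned)
```

In the aligned pair the reactant string differs from the product only by the inserted `Cl.` fragment, whereas the unaligned pair encodes the same molecules with far less shared text, which is the kind of syntactic burden the abstract says the model is relieved from learning.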