Img2Mol – accurate SMILES recognition from molecular graphical depictions
The automatic recognition of the molecular content of a molecule's graphical depiction is an extremely challenging problem that remains largely unsolved despite decades of research. Recent advances in neural machine translation enable the auto-encoding of molecular structures in a continuous ve...
Main authors: | Clevert, Djork-Arné; Le, Tuan; Winter, Robin; Montanari, Floriane |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry, 2021 |
Subjects: | Chemistry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8565361/ https://www.ncbi.nlm.nih.gov/pubmed/34760202 http://dx.doi.org/10.1039/d1sc01839f |
_version_ | 1784593809676435456 |
---|---|
author | Clevert, Djork-Arné; Le, Tuan; Winter, Robin; Montanari, Floriane |
author_sort | Clevert, Djork-Arné |
collection | PubMed |
description | The automatic recognition of the molecular content of a molecule's graphical depiction is an extremely challenging problem that remains largely unsolved despite decades of research. Recent advances in neural machine translation enable the auto-encoding of molecular structures in a continuous vector space of fixed size (latent representation) with low reconstruction errors. In this paper, we present a fast and accurate model combining deep convolutional neural network learning from molecule depictions and a pre-trained decoder that translates the latent representation into the SMILES representation of the molecules. This combination allows us to precisely infer a molecular structure from an image. Our rigorous evaluation shows that Img2Mol is able to correctly translate up to 88% of the molecular depictions into their SMILES representation. A pretrained version of Img2Mol is made publicly available on GitHub for non-commercial users. |
format | Online Article Text |
id | pubmed-8565361 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-8565361; 2021-11-09; Img2Mol – accurate SMILES recognition from molecular graphical depictions; Clevert, Djork-Arné; Le, Tuan; Winter, Robin; Montanari, Floriane; Chem Sci; Chemistry; The Royal Society of Chemistry; 2021-09-29; /pmc/articles/PMC8565361/; /pubmed/34760202; http://dx.doi.org/10.1039/d1sc01839f; Text; en; This journal is © The Royal Society of Chemistry; https://creativecommons.org/licenses/by-nc/3.0/ |
title | Img2Mol – accurate SMILES recognition from molecular graphical depictions |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8565361/ https://www.ncbi.nlm.nih.gov/pubmed/34760202 http://dx.doi.org/10.1039/d1sc01839f |