Molecular Identification from AFM Images Using the IUPAC Nomenclature and Attribute Multimodal Recurrent Neural Networks

Spectroscopic methods such as nuclear magnetic resonance, mass spectrometry, X-ray diffraction, and UV/visible spectroscopy, applied to molecular ensembles, have so far been the workhorse for molecular identification. Here, we propose a radically different chemical characterization ap...

Bibliographic Details
Main Authors: Carracedo-Cosme, Jaime; Romero-Muñiz, Carlos; Pou, Pablo; Pérez, Rubén
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10176476/
https://www.ncbi.nlm.nih.gov/pubmed/37126486
http://dx.doi.org/10.1021/acsami.3c01550
author Carracedo-Cosme, Jaime
Romero-Muñiz, Carlos
Pou, Pablo
Pérez, Rubén
collection PubMed
description Spectroscopic methods such as nuclear magnetic resonance, mass spectrometry, X-ray diffraction, and UV/visible spectroscopy, applied to molecular ensembles, have so far been the workhorse for molecular identification. Here, we propose a radically different chemical characterization approach, based on the ability of noncontact atomic force microscopy with metal tips functionalized with a CO molecule at the tip apex (referred to as HR-AFM) to resolve the internal structure of individual molecules. Our work demonstrates that a stack of constant-height HR-AFM images carries enough chemical information for a complete identification (structure and composition) of quasiplanar organic molecules, and that this information can be retrieved using machine learning techniques that are able to disentangle the contributions of chemical composition, bond topology, and internal torsion of the molecule to the HR-AFM contrast. In particular, we exploit multimodal recurrent neural networks (M-RNN), which combine convolutional neural networks for image analysis with recurrent neural networks for language processing, to formulate molecular identification as an image captioning problem. The algorithm is trained on a data set containing almost 700,000 molecules and 165 million theoretical AFM images to produce, as its final output, the IUPAC name of the imaged molecule. Our extensive tests with theoretical images and a few experimental ones show the potential of deep learning algorithms in the automatic identification of molecular compounds by AFM. This achievement supports the development of on-surface synthesis and overcomes some limitations of spectroscopic methods in traditional solution-based synthesis.
format Online
Article
Text
id pubmed-10176476
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10176476 2023-05-13 ACS Appl Mater Interfaces
American Chemical Society 2023-05-01 /pmc/articles/PMC10176476/ /pubmed/37126486 http://dx.doi.org/10.1021/acsami.3c01550 Text en © 2023 The Authors. Published by American Chemical Society. License: https://creativecommons.org/licenses/by/4.0/ (permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained).
title Molecular Identification from AFM Images Using the IUPAC Nomenclature and Attribute Multimodal Recurrent Neural Networks
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10176476/
https://www.ncbi.nlm.nih.gov/pubmed/37126486
http://dx.doi.org/10.1021/acsami.3c01550