
Deep learning-based automatic action extraction from structured chemical synthesis procedures

This article proposes a methodology that uses machine learning algorithms to extract actions from structured chemical synthesis procedures, thereby bridging the gap between chemistry and natural language processing. The proposed pipeline combines ML algorithms and scripts to extract relevant data fr...


Bibliographic Details
Main authors: Vaškevičius, Mantas, Kapočiūtė-Dzikienė, Jurgita, Vaškevičius, Arnas, Šlepikas, Liudas
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495970/
https://www.ncbi.nlm.nih.gov/pubmed/37705639
http://dx.doi.org/10.7717/peerj-cs.1511
author Vaškevičius, Mantas
Kapočiūtė-Dzikienė, Jurgita
Vaškevičius, Arnas
Šlepikas, Liudas
author_sort Vaškevičius, Mantas
collection PubMed
description This article proposes a methodology that uses machine learning algorithms to extract actions from structured chemical synthesis procedures, thereby bridging the gap between chemistry and natural language processing. The proposed pipeline combines ML algorithms and scripts to extract relevant data from USPTO and EPO patents, which helps transform experimental procedures into structured actions. This pipeline includes two primary tasks: classifying patent paragraphs to select chemical procedures and converting chemical procedure sentences into a structured, simplified format. We employ artificial neural networks such as long short-term memory, bidirectional LSTMs, transformers, and fine-tuned T5. Our results show that the bidirectional LSTM classifier achieved the highest accuracy of 0.939 in the first task, while the Transformer model attained the highest BLEU score of 0.951 in the second task. The developed pipeline enables the creation of a dataset of chemical reactions and their procedures in a structured format, facilitating the application of AI-based approaches to streamline synthetic pathways, predict reaction outcomes, and optimize experimental conditions. Furthermore, the developed pipeline allows for creating a structured dataset of chemical reactions and procedures, making it easier for researchers to access and utilize the valuable information in synthesis procedures.
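The two tasks named in the description (selecting procedure paragraphs, then converting procedure sentences into structured actions) can be illustrated with a minimal sketch. This is not the authors' method: the article uses trained neural models (LSTM, bidirectional LSTM, Transformer, fine-tuned T5), whereas the sketch below swaps in a keyword filter and a few regular expressions so that it runs without any training data. The cue words, the action names `ADD`, `STIR`, `FILTER`, and every function name are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical cue phrases; a stand-in for the trained paragraph classifier.
PROCEDURE_CUES = ("was added", "stirred", "heated", "filtered", "washed")


def is_procedure(paragraph: str, min_hits: int = 2) -> bool:
    """Task 1 stand-in: flag a paragraph as a synthesis procedure
    when it contains at least `min_hits` distinct cue phrases."""
    text = paragraph.lower()
    return sum(cue in text for cue in PROCEDURE_CUES) >= min_hits


@dataclass
class Action:
    name: str    # hypothetical action label, e.g. "ADD"
    detail: str  # free-text argument captured from the sentence


# Hypothetical patterns; a stand-in for the sequence-to-sequence model.
ACTION_PATTERNS = [
    ("ADD", re.compile(r"^(?P<detail>.+?) was added\b", re.I)),
    ("STIR", re.compile(r"\bstirred (?P<detail>.+)$", re.I)),
    ("FILTER", re.compile(r"\bfiltered\b(?P<detail>.*)$", re.I)),
]


def sentence_to_actions(sentence: str) -> list[Action]:
    """Task 2 stand-in: map one sentence to zero or more actions."""
    sentence = sentence.strip().rstrip(".")
    actions = []
    for name, pattern in ACTION_PATTERNS:
        match = pattern.search(sentence)
        if match:
            actions.append(Action(name, match.group("detail").strip()))
    return actions


def extract_actions(paragraphs: list[str]) -> list[Action]:
    """Run both stages: keep procedure paragraphs, then convert each sentence."""
    actions = []
    for paragraph in paragraphs:
        if not is_procedure(paragraph):
            continue
        # Naive sentence split on whitespace that follows a period.
        for sentence in re.split(r"(?<=\.)\s+", paragraph):
            actions.extend(sentence_to_actions(sentence))
    return actions


paragraphs = [
    "The invention relates to novel compounds.",
    "Sodium hydride was added to the solution. "
    "The mixture was stirred for 2 h at room temperature. "
    "The solid was filtered.",
]
result = extract_actions(paragraphs)
```

In this sketch the first paragraph is discarded by the filter and the second yields one action per sentence. A learned classifier and translation model, as described in the article, would replace `is_procedure` and `sentence_to_actions` respectively while the surrounding two-stage structure stays the same.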
format Online
Article
Text
id pubmed-10495970
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10495970 2023-09-13 PeerJ Comput Sci Artificial Intelligence PeerJ Inc. 2023-08-18 /pmc/articles/PMC10495970/ /pubmed/37705639 http://dx.doi.org/10.7717/peerj-cs.1511 Text en © 2023 Vaškevičius et al.
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Deep learning-based automatic action extraction from structured chemical synthesis procedures
topic Artificial Intelligence