Identifying Chemical Reactions and Their Associated Attributes in Patents

Chemical patents are an essential source of information about novel chemicals and chemical reactions. However, with the increasing volume of such patents, mining information about these chemicals and chemical reactions has become a time-intensive and laborious endeavor. In this study, we present a s...

Bibliographic Details
Main authors: Mahendran, Darshini; Gurdin, Gabrielle; Lewinski, Nastassja; Tang, Christina; McInnes, Bridget T.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8312343/
https://www.ncbi.nlm.nih.gov/pubmed/34322654
http://dx.doi.org/10.3389/frma.2021.688353
_version_ 1783729128357232640
author Mahendran, Darshini
Gurdin, Gabrielle
Lewinski, Nastassja
Tang, Christina
McInnes, Bridget T.
author_facet Mahendran, Darshini
Gurdin, Gabrielle
Lewinski, Nastassja
Tang, Christina
McInnes, Bridget T.
author_sort Mahendran, Darshini
collection PubMed
description Chemical patents are an essential source of information about novel chemicals and chemical reactions. However, with the increasing volume of such patents, mining information about these chemicals and chemical reactions has become a time-intensive and laborious endeavor. In this study, we present a system to extract chemical reaction events from patents automatically. Our approach consists of two steps: 1) named entity recognition (NER), the automatic identification of chemical reaction parameters from the corresponding text, and 2) event extraction (EE), the automatic classification and linking of entities based on their relationships to each other. For our NER system, we evaluate bidirectional long short-term memory (BiLSTM)-based and bidirectional encoder representations from transformers (BERT)-based methods. For our EE system, we evaluate BERT-based, convolutional neural network (CNN)-based, and rule-based methods. We evaluate our NER and EE components independently and as an end-to-end system, reporting the precision, recall, and F1 score. Our results show that the BiLSTM-based method performed best at identifying the entities, and the CNN-based method performed best at extracting events.
format Online
Article
Text
id pubmed-8312343
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8312343 2021-07-27 Identifying Chemical Reactions and Their Associated Attributes in Patents Mahendran, Darshini Gurdin, Gabrielle Lewinski, Nastassja Tang, Christina McInnes, Bridget T. Front Res Metr Anal Research Metrics and Analytics Chemical patents are an essential source of information about novel chemicals and chemical reactions. However, with the increasing volume of such patents, mining information about these chemicals and chemical reactions has become a time-intensive and laborious endeavor. In this study, we present a system to extract chemical reaction events from patents automatically. Our approach consists of two steps: 1) named entity recognition (NER), the automatic identification of chemical reaction parameters from the corresponding text, and 2) event extraction (EE), the automatic classification and linking of entities based on their relationships to each other. For our NER system, we evaluate bidirectional long short-term memory (BiLSTM)-based and bidirectional encoder representations from transformers (BERT)-based methods. For our EE system, we evaluate BERT-based, convolutional neural network (CNN)-based, and rule-based methods. We evaluate our NER and EE components independently and as an end-to-end system, reporting the precision, recall, and F1 score. Our results show that the BiLSTM-based method performed best at identifying the entities, and the CNN-based method performed best at extracting events. Frontiers Media S.A. 2021-07-12 /pmc/articles/PMC8312343/ /pubmed/34322654 http://dx.doi.org/10.3389/frma.2021.688353 Text en Copyright © 2021 Mahendran, Gurdin, Lewinski, Tang and McInnes. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Research Metrics and Analytics
Mahendran, Darshini
Gurdin, Gabrielle
Lewinski, Nastassja
Tang, Christina
McInnes, Bridget T.
Identifying Chemical Reactions and Their Associated Attributes in Patents
title Identifying Chemical Reactions and Their Associated Attributes in Patents
title_full Identifying Chemical Reactions and Their Associated Attributes in Patents
title_fullStr Identifying Chemical Reactions and Their Associated Attributes in Patents
title_full_unstemmed Identifying Chemical Reactions and Their Associated Attributes in Patents
title_short Identifying Chemical Reactions and Their Associated Attributes in Patents
title_sort identifying chemical reactions and their associated attributes in patents
topic Research Metrics and Analytics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8312343/
https://www.ncbi.nlm.nih.gov/pubmed/34322654
http://dx.doi.org/10.3389/frma.2021.688353
work_keys_str_mv AT mahendrandarshini identifyingchemicalreactionsandtheirassociatedattributesinpatents
AT gurdingabrielle identifyingchemicalreactionsandtheirassociatedattributesinpatents
AT lewinskinastassja identifyingchemicalreactionsandtheirassociatedattributesinpatents
AT tangchristina identifyingchemicalreactionsandtheirassociatedattributesinpatents
AT mcinnesbridgett identifyingchemicalreactionsandtheirassociatedattributesinpatents