Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction?
Collecting relations between chemicals and drugs is crucial in biomedical research. Pre-trained transformer models such as Bidirectional Encoder Representations from Transformers (BERT) have shown limitations on biomedical texts; more specifically, the scarcity of annotated data makes relation extraction (RE) from biomedical texts very challenging.
Main Authors: | Tang, Anfu; Deléger, Louise; Bossy, Robert; Zweigenbaum, Pierre; Nédellec, Claire |
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9408061/ https://www.ncbi.nlm.nih.gov/pubmed/36006843 http://dx.doi.org/10.1093/database/baac070 |
_version_ | 1784774515091308544 |
author | Tang, Anfu Deléger, Louise Bossy, Robert Zweigenbaum, Pierre Nédellec, Claire |
author_facet | Tang, Anfu Deléger, Louise Bossy, Robert Zweigenbaum, Pierre Nédellec, Claire |
author_sort | Tang, Anfu |
collection | PubMed |
description | Collecting relations between chemicals and drugs is crucial in biomedical research. Pre-trained transformer models such as Bidirectional Encoder Representations from Transformers (BERT) have shown limitations on biomedical texts; more specifically, the scarcity of annotated data makes relation extraction (RE) from biomedical texts very challenging. In this paper, we hypothesize that enriching a pre-trained transformer model with syntactic information may help improve its performance on chemical–drug RE tasks. For this purpose, we propose three syntax-enhanced models based on the domain-specific BioBERT model: Chunking-Enhanced-BioBERT and Constituency-Tree-BioBERT, in which constituency information is integrated, and a Multi-Task-Learning framework, Multi-Task-Syntactic (MTS)-BioBERT, in which syntactic information is injected implicitly by adding syntax-related tasks as training objectives. In addition, we test an existing model, Late-Fusion, which is enhanced with syntactic dependency information, and build ensemble systems combining syntax-enhanced and non-syntax-enhanced models. Experiments are conducted on the BioCreative VII DrugProt corpus, a manually annotated corpus for the development and evaluation of RE systems. Our results reveal that syntax-enhanced models generally degrade the performance of BioBERT on biomedical RE but improve it when the subject–object distance of a candidate semantic relation is long. We also explore the impact of the quality of dependency parses. [Our code is available at: https://github.com/Maple177/syntax-enhanced-RE/tree/drugprot (for MTS-BioBERT only); https://github.com/Maple177/drugprot-relation-extraction (for the rest of the experiments)] Database URL https://github.com/Maple177/drugprot-relation-extraction |
format | Online Article Text |
id | pubmed-9408061 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9408061 2022-08-26 Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? Tang, Anfu Deléger, Louise Bossy, Robert Zweigenbaum, Pierre Nédellec, Claire Database (Oxford) Original Article Collecting relations between chemicals and drugs is crucial in biomedical research. Pre-trained transformer models such as Bidirectional Encoder Representations from Transformers (BERT) have shown limitations on biomedical texts; more specifically, the scarcity of annotated data makes relation extraction (RE) from biomedical texts very challenging. In this paper, we hypothesize that enriching a pre-trained transformer model with syntactic information may help improve its performance on chemical–drug RE tasks. For this purpose, we propose three syntax-enhanced models based on the domain-specific BioBERT model: Chunking-Enhanced-BioBERT and Constituency-Tree-BioBERT, in which constituency information is integrated, and a Multi-Task-Learning framework, Multi-Task-Syntactic (MTS)-BioBERT, in which syntactic information is injected implicitly by adding syntax-related tasks as training objectives. In addition, we test an existing model, Late-Fusion, which is enhanced with syntactic dependency information, and build ensemble systems combining syntax-enhanced and non-syntax-enhanced models. Experiments are conducted on the BioCreative VII DrugProt corpus, a manually annotated corpus for the development and evaluation of RE systems. Our results reveal that syntax-enhanced models generally degrade the performance of BioBERT on biomedical RE but improve it when the subject–object distance of a candidate semantic relation is long. We also explore the impact of the quality of dependency parses.
[Our code is available at: https://github.com/Maple177/syntax-enhanced-RE/tree/drugprot (for MTS-BioBERT only); https://github.com/Maple177/drugprot-relation-extraction (for the rest of the experiments)] Database URL https://github.com/Maple177/drugprot-relation-extraction Oxford University Press 2022-08-25 /pmc/articles/PMC9408061/ /pubmed/36006843 http://dx.doi.org/10.1093/database/baac070 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Original Article Tang, Anfu Deléger, Louise Bossy, Robert Zweigenbaum, Pierre Nédellec, Claire Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title | Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title_full | Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title_fullStr | Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title_full_unstemmed | Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title_short | Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical–drug relation extraction? |
title_sort | do syntactic trees enhance bidirectional encoder representations from transformers (bert) models for chemical–drug relation extraction? |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9408061/ https://www.ncbi.nlm.nih.gov/pubmed/36006843 http://dx.doi.org/10.1093/database/baac070 |
work_keys_str_mv | AT tanganfu dosyntactictreesenhancebidirectionalencoderrepresentationsfromtransformersbertmodelsforchemicaldrugrelationextraction AT delegerlouise dosyntactictreesenhancebidirectionalencoderrepresentationsfromtransformersbertmodelsforchemicaldrugrelationextraction AT bossyrobert dosyntactictreesenhancebidirectionalencoderrepresentationsfromtransformersbertmodelsforchemicaldrugrelationextraction AT zweigenbaumpierre dosyntactictreesenhancebidirectionalencoderrepresentationsfromtransformersbertmodelsforchemicaldrugrelationextraction AT nedellecclaire dosyntactictreesenhancebidirectionalencoderrepresentationsfromtransformersbertmodelsforchemicaldrugrelationextraction |