From POS tagging to dependency parsing for biomedical event extraction

BACKGROUND: Given the importance of relation or event extraction from biomedical research publications to support knowledge capture and synthesis, and the strong dependency of approaches to this information extraction task on syntactic information, it is valuable to understand which approaches to sy...

Bibliographic Details
Main Authors: Nguyen, Dat Quoc; Verspoor, Karin
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6373122/
https://www.ncbi.nlm.nih.gov/pubmed/30755172
http://dx.doi.org/10.1186/s12859-019-2604-0
author Nguyen, Dat Quoc
Verspoor, Karin
collection PubMed
description BACKGROUND: Given the importance of relation or event extraction from biomedical research publications to support knowledge capture and synthesis, and the strong dependency of approaches to this information extraction task on syntactic information, it is valuable to understand which approaches to syntactic processing of biomedical text have the highest performance. RESULTS: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks, part-of-speech (POS) tagging and dependency parsing, on two benchmark biomedical corpora, GENIA and CRAFT. To the best of our knowledge, there is no recent work making such comparisons in the biomedical context; specifically, no detailed analysis of neural models on this data is available. Experimental results show that, in general, the neural models outperform the feature-based models on both corpora. We also perform a task-oriented evaluation to investigate the influence of these models on a downstream biomedical event extraction application, and show that better intrinsic parsing performance does not always imply better extrinsic event extraction performance. CONCLUSION: We have presented a detailed empirical study comparing traditional feature-based and neural network-based models for POS tagging and dependency parsing in the biomedical context, and also investigated the influence of parser selection on a downstream biomedical event extraction task. AVAILABILITY OF DATA AND MATERIALS: We make the retrained models available at https://github.com/datquocnguyen/BioPosDep.
format Online
Article
Text
id pubmed-6373122
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6373122 2019-02-25 From POS tagging to dependency parsing for biomedical event extraction Nguyen, Dat Quoc Verspoor, Karin BMC Bioinformatics Research Article
BioMed Central 2019-02-12 /pmc/articles/PMC6373122/ /pubmed/30755172 http://dx.doi.org/10.1186/s12859-019-2604-0 Text en © The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title From POS tagging to dependency parsing for biomedical event extraction
topic Research Article