
Boosting automatic event extraction from the literature using domain adaptation and coreference resolution

Motivation: In recent years, several biomedical event extraction (EE) systems have been developed. However, the nature of the annotated training corpora, as well as the training process itself, can limit the performance levels of the trained EE systems. In particular, most event-annotated corpora do...


Bibliographic Details
Main authors: Miwa, Makoto, Thompson, Paul, Ananiadou, Sophia
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3381963/
https://www.ncbi.nlm.nih.gov/pubmed/22539668
http://dx.doi.org/10.1093/bioinformatics/bts237
author Miwa, Makoto
Thompson, Paul
Ananiadou, Sophia
collection PubMed
description Motivation: In recent years, several biomedical event extraction (EE) systems have been developed. However, the nature of the annotated training corpora, as well as the training process itself, can limit the performance levels of the trained EE systems. In particular, most event-annotated corpora do not deal adequately with coreference. This impacts on the trained systems' ability to recognize biomedical entities, thus affecting their performance in extracting events accurately. Additionally, the fact that most EE systems are trained on a single annotated corpus further restricts their coverage. Results: We have enhanced our existing EE system, EventMine, in two ways. First, we developed a new coreference resolution (CR) system and integrated it with EventMine. The standalone performance of our CR system in resolving anaphoric references to proteins is considerably higher than the best ranked system in the COREF subtask of the BioNLP'11 Shared Task. Secondly, the improved EventMine incorporates domain adaptation (DA) methods, which extend EE coverage by allowing several different annotated corpora to be used during training. Combined with a novel set of methods to increase the generality and efficiency of EventMine, the integration of both CR and DA has resulted in significant improvements in EE, ranging between 0.5% and 3.4% F-Score. The enhanced EventMine outperforms the highest ranked systems from the BioNLP'09 shared task, and from the GENIA and Infectious Diseases subtasks of the BioNLP'11 shared task. Availability: The improved version of EventMine, incorporating the CR system and DA methods, is available at: http://www.nactem.ac.uk/EventMine/. Contact: makoto.miwa@manchester.ac.uk
format Online
Article
Text
id pubmed-3381963
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3381963 2012-06-25 Boosting automatic event extraction from the literature using domain adaptation and coreference resolution Miwa, Makoto Thompson, Paul Ananiadou, Sophia Bioinformatics Original Papers Motivation: In recent years, several biomedical event extraction (EE) systems have been developed. However, the nature of the annotated training corpora, as well as the training process itself, can limit the performance levels of the trained EE systems. In particular, most event-annotated corpora do not deal adequately with coreference. This impacts on the trained systems' ability to recognize biomedical entities, thus affecting their performance in extracting events accurately. Additionally, the fact that most EE systems are trained on a single annotated corpus further restricts their coverage. Results: We have enhanced our existing EE system, EventMine, in two ways. First, we developed a new coreference resolution (CR) system and integrated it with EventMine. The standalone performance of our CR system in resolving anaphoric references to proteins is considerably higher than the best ranked system in the COREF subtask of the BioNLP'11 Shared Task. Secondly, the improved EventMine incorporates domain adaptation (DA) methods, which extend EE coverage by allowing several different annotated corpora to be used during training. Combined with a novel set of methods to increase the generality and efficiency of EventMine, the integration of both CR and DA has resulted in significant improvements in EE, ranging between 0.5% and 3.4% F-Score. The enhanced EventMine outperforms the highest ranked systems from the BioNLP'09 shared task, and from the GENIA and Infectious Diseases subtasks of the BioNLP'11 shared task. Availability: The improved version of EventMine, incorporating the CR system and DA methods, is available at: http://www.nactem.ac.uk/EventMine/.
Contact: makoto.miwa@manchester.ac.uk Oxford University Press 2012-07-01 2012-04-25 /pmc/articles/PMC3381963/ /pubmed/22539668 http://dx.doi.org/10.1093/bioinformatics/bts237 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Boosting automatic event extraction from the literature using domain adaptation and coreference resolution
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3381963/
https://www.ncbi.nlm.nih.gov/pubmed/22539668
http://dx.doi.org/10.1093/bioinformatics/bts237