
Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011

We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions...


Bibliographic Details
Main Authors: Pyysalo, Sampo; Ohta, Tomoko; Rak, Rafal; Sullivan, Dan; Mao, Chunhong; Wang, Chunxia; Sobral, Bruno; Tsujii, Jun'ichi; Ananiadou, Sophia
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects: Proceedings
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384257/
https://www.ncbi.nlm.nih.gov/pubmed/22759456
http://dx.doi.org/10.1186/1471-2105-13-S11-S2
author Pyysalo, Sampo
Ohta, Tomoko
Rak, Rafal
Sullivan, Dan
Mao, Chunhong
Wang, Chunxia
Sobral, Bruno
Tsujii, Jun'ichi
Ananiadou, Sophia
author_sort Pyysalo, Sampo
collection PubMed
description We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. 
In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternative evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties.
format Online
Article
Text
id pubmed-3384257
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3384257 2012-06-28 Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011. Pyysalo, Sampo; Ohta, Tomoko; Rak, Rafal; Sullivan, Dan; Mao, Chunhong; Wang, Chunxia; Sobral, Bruno; Tsujii, Jun'ichi; Ananiadou, Sophia. BMC Bioinformatics. Proceedings. BioMed Central 2012-06-26 /pmc/articles/PMC3384257/ /pubmed/22759456 http://dx.doi.org/10.1186/1471-2105-13-S11-S2 Text en Copyright ©2012 Pyysalo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011
topic Proceedings