Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011
We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions...
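Main authors: | Pyysalo, Sampo; Ohta, Tomoko; Rak, Rafal; Sullivan, Dan; Mao, Chunhong; Wang, Chunxia; Sobral, Bruno; Tsujii, Jun'ichi; Ananiadou, Sophia |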
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384257/ https://www.ncbi.nlm.nih.gov/pubmed/22759456 http://dx.doi.org/10.1186/1471-2105-13-S11-S2 |
_version_ | 1782236684734169088 |
---|---|
author | Pyysalo, Sampo Ohta, Tomoko Rak, Rafal Sullivan, Dan Mao, Chunhong Wang, Chunxia Sobral, Bruno Tsujii, Jun'ichi Ananiadou, Sophia |
author_facet | Pyysalo, Sampo Ohta, Tomoko Rak, Rafal Sullivan, Dan Mao, Chunhong Wang, Chunxia Sobral, Bruno Tsujii, Jun'ichi Ananiadou, Sophia |
author_sort | Pyysalo, Sampo |
collection | PubMed |
description | We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. 
In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend on previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties. |
format | Online Article Text |
id | pubmed-3384257 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-33842572012-06-28 Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 Pyysalo, Sampo Ohta, Tomoko Rak, Rafal Sullivan, Dan Mao, Chunhong Wang, Chunxia Sobral, Bruno Tsujii, Jun'ichi Ananiadou, Sophia BMC Bioinformatics Proceedings We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. 
For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend on previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties. BioMed Central 2012-06-26 /pmc/articles/PMC3384257/ /pubmed/22759456 http://dx.doi.org/10.1186/1471-2105-13-S11-S2 Text en Copyright ©2012 Pyysalo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Pyysalo, Sampo Ohta, Tomoko Rak, Rafal Sullivan, Dan Mao, Chunhong Wang, Chunxia Sobral, Bruno Tsujii, Jun'ichi Ananiadou, Sophia Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title_full | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title_fullStr | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title_full_unstemmed | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title_short | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 |
title_sort | overview of the id, epi and rel tasks of bionlp shared task 2011 |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384257/ https://www.ncbi.nlm.nih.gov/pubmed/22759456 http://dx.doi.org/10.1186/1471-2105-13-S11-S2 |
work_keys_str_mv | AT pyysalosampo overviewoftheidepiandreltasksofbionlpsharedtask2011 AT ohtatomoko overviewoftheidepiandreltasksofbionlpsharedtask2011 AT rakrafal overviewoftheidepiandreltasksofbionlpsharedtask2011 AT sullivandan overviewoftheidepiandreltasksofbionlpsharedtask2011 AT maochunhong overviewoftheidepiandreltasksofbionlpsharedtask2011 AT wangchunxia overviewoftheidepiandreltasksofbionlpsharedtask2011 AT sobralbruno overviewoftheidepiandreltasksofbionlpsharedtask2011 AT tsujiijunichi overviewoftheidepiandreltasksofbionlpsharedtask2011 AT ananiadousophia overviewoftheidepiandreltasksofbionlpsharedtask2011 |