
Combining joint models for biomedical event extraction

BACKGROUND: We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the bes...


Bibliographic Details
Main authors:	McClosky, David; Riedel, Sebastian; Surdeanu, Mihai; McCallum, Andrew; Manning, Christopher D
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395172/ https://www.ncbi.nlm.nih.gov/pubmed/22759463 http://dx.doi.org/10.1186/1471-2105-13-S11-S9

_version_	1782237946206748672
author	McClosky, David; Riedel, Sebastian; Surdeanu, Mihai; McCallum, Andrew; Manning, Christopher D
author_facet	McClosky, David; Riedel, Sebastian; Surdeanu, Mihai; McCallum, Andrew; Manning, Christopher D
author_sort	McClosky, David
collection	PubMed
description	BACKGROUND: We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best scoring event structure. Our primary focus is on stacking, where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques such as intersection and union, which require only the outputs from each system and combine them directly. RESULTS: First, we find that stacking substantially improves performance while intersection and union provide no significant benefits. Second, we investigate the graph properties of event structures and their impact on the combination of our systems. Finally, we trace the origins of events proposed by the stacked model to determine the role each system plays in different components of the output. We learn that, while stacking can propose novel event structures not seen in either base model, these events have extremely low precision. Removing these novel events improves our already state-of-the-art F1 to 56.6% on the test set of Genia (Task 1). Overall, the combined system formed via stacking ("FAUST") performed well in the BioNLP 2011 shared task. The FAUST system obtained 1st place in three out of four tasks: 1st place in Genia Task 1 (56.0% F1) and Task 2 (53.9%), 2nd place in the Epigenetics and Post-translational Modifications track (35.0%), and 1st place in the Infectious Diseases track (55.6%). CONCLUSION: We present a state-of-the-art event extraction system that relies on the strengths of structured prediction and model combination through stacking. Akin to results on other tasks, stacking outperforms intersection and union and leads to very strong results. The utility of model combination hinges on complementary views of the data, and we show that our sub-systems capture different graph properties of event structures. Finally, by removing low-precision novel events, we show that performance from stacking can be further improved.
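
The simpler combination strategies named in the description (intersection and union of system outputs) and the removal of novel events from the stacked output can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation: the `Event` representation, the helper names, and the example events are all hypothetical, and real event structures are nested graphs rather than flat tuples.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple


class Event(NamedTuple):
    """Hypothetical flat event: a typed trigger plus role-labelled arguments."""
    event_type: str
    trigger: str
    arguments: FrozenSet[Tuple[str, str]]  # (role, argument text) pairs


def make_event(event_type: str, trigger: str, **roles: str) -> Event:
    """Build a hashable event so it can be stored in a set."""
    return Event(event_type, trigger, frozenset(roles.items()))


def combine_union(a: Set[Event], b: Set[Event]) -> Set[Event]:
    """Keep every event proposed by either system."""
    return a | b


def combine_intersection(a: Set[Event], b: Set[Event]) -> Set[Event]:
    """Keep only events proposed by both systems."""
    return a & b


def drop_novel_events(stacked: Set[Event], a: Set[Event], b: Set[Event]) -> Set[Event]:
    """Discard stacked-model events that neither base system proposed."""
    return stacked & (a | b)


# Illustrative outputs only; these are not data from the paper.
phos = make_event("Phosphorylation", "phosphorylates", Theme="TRAF2")
expr = make_event("Gene_expression", "expression", Theme="IL-2")
bind = make_event("Binding", "binds", Theme="CD40")
novel = make_event("Regulation", "induces", Theme="IL-2")

umass = {phos, expr}            # assumed output of the first base system
stanford = {phos, bind}         # assumed output of the second base system
stacked = {phos, expr, novel}   # assumed output of the stacked model

both = combine_intersection(umass, stanford)
either = combine_union(umass, stanford)
filtered = drop_novel_events(stacked, umass, stanford)
```

In this sketch, intersection trades recall for precision, union does the opposite, and the novel-event filter only removes stacked predictions that have no support in either base output.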
format	Online Article Text
id	pubmed-3395172
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3395172 2012-07-16 Combining joint models for biomedical event extraction McClosky, David; Riedel, Sebastian; Surdeanu, Mihai; McCallum, Andrew; Manning, Christopher D BMC Bioinformatics Proceedings BioMed Central 2012-06-26 /pmc/articles/PMC3395172/ /pubmed/22759463 http://dx.doi.org/10.1186/1471-2105-13-S11-S9 Text en Copyright ©2012 McClosky et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Proceedings McClosky, David Riedel, Sebastian Surdeanu, Mihai McCallum, Andrew Manning, Christopher D Combining joint models for biomedical event extraction
title	Combining joint models for biomedical event extraction
title_full	Combining joint models for biomedical event extraction
title_fullStr	Combining joint models for biomedical event extraction
title_full_unstemmed	Combining joint models for biomedical event extraction
title_short	Combining joint models for biomedical event extraction
title_sort	combining joint models for biomedical event extraction
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395172/ https://www.ncbi.nlm.nih.gov/pubmed/22759463 http://dx.doi.org/10.1186/1471-2105-13-S11-S9
work_keys_str_mv	AT mccloskydavid combiningjointmodelsforbiomedicaleventextraction AT riedelsebastian combiningjointmodelsforbiomedicaleventextraction AT surdeanumihai combiningjointmodelsforbiomedicaleventextraction AT mccallumandrew combiningjointmodelsforbiomedicaleventextraction AT manningchristopherd combiningjointmodelsforbiomedicaleventextraction


Similar Items