Bidirectional long short-term memory with CRF for detecting biomedical event trigger in FastText semantic space

BACKGROUND: In biomedical information extraction, event extraction plays a crucial role. Biological events are used to describe the dynamic effects or relationships between biological entities such as proteins and genes. Event extraction is generally divided into trigger detection and argument recog...

Bibliographic Details
Main authors:	Wang, Yan; Wang, Jian; Lin, Hongfei; Tang, Xiwei; Zhang, Shaowu; Li, Lishuang
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302454/ https://www.ncbi.nlm.nih.gov/pubmed/30577839 http://dx.doi.org/10.1186/s12859-018-2543-1

_version_	1783381983042207744
author	Wang, Yan; Wang, Jian; Lin, Hongfei; Tang, Xiwei; Zhang, Shaowu; Li, Lishuang
author_sort	Wang, Yan
collection	PubMed
description	BACKGROUND: In biomedical information extraction, event extraction plays a crucial role. Biological events are used to describe the dynamic effects or relationships between biological entities such as proteins and genes. Event extraction is generally divided into trigger detection and argument recognition. The performance of trigger detection directly affects the results of the event extraction. In general, the traditional method is used to address the trigger detection as a classification task, as well as the use of machine learning or rules method, which construct many features to improve the classification results. Moreover, the classification model only recognizes triggers composed of single words, whereas for multiple words, the result is unsatisfactory. RESULTS: The corpus of our model is MLEE. If we were to only use the biomedical LSTM and CRF model without other features, the F-score would reach about 78.08%. Comparing entity to part of speech (POS), we find the entity features more conducive to the improvement of performance of detection, with the F-score potentially reaching about 80%. Furthermore, we also experiment on the other three corpora (BioNLP 2009, BioNLP 2011, and BioNLP 2013) to verify the generalization of our model. Hence, F-scores can reach more than 60%, which are better than the comparative experiments. CONCLUSIONS: The trigger recognition method based on the sequence annotation model does not require initial complex feature engineering, and only requires a simple labeling mechanism to complete the training. Therefore, generalization of our model is better compared to other traditional models. Secondly, this method can identify multi-word triggers, thereby improving the F-scores of trigger recognition. Thirdly, details on the entity have a crucial impact on trigger detection. Finally, the combination of character-level word embedding and word-level word embedding provides increasingly effective information for the model; therefore, it is a key to the success of the experiment.
format	Online Article Text
id	pubmed-6302454
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6302454 2018-12-31 BMC Bioinformatics BioMed Central 2018-12-21 /pmc/articles/PMC6302454/ /pubmed/30577839 http://dx.doi.org/10.1186/s12859-018-2543-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Bidirectional long short-term memory with CRF for detecting biomedical event trigger in FastText semantic space
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302454/ https://www.ncbi.nlm.nih.gov/pubmed/30577839 http://dx.doi.org/10.1186/s12859-018-2543-1
