
A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism

Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the per...


Bibliographic Details
Main authors: Lu, Yang; Ma, Xiaolei; Lu, Yinan; Zhou, Yuxin; Pei, Zhili
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206857/
https://www.ncbi.nlm.nih.gov/pubmed/28096894
http://dx.doi.org/10.1155/2016/7536494
author Lu, Yang
Ma, Xiaolei
Lu, Yinan
Zhou, Yuxin
Pei, Zhili
collection PubMed
description Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the performance of the classification algorithms. In this study, a sample selection algorithm based on sequential patterns is proposed to filter negative samples in the training phase. Considering the joint information between the trigger and argument of multiargument events, we extract triplets of multiargument events directly using a support vector machine classifier. A joint scoring mechanism, which is based on sentence similarity and the importance of the trigger in the training data, is used to correct the predicted results. Experimental results indicate that the proposed method can extract events efficiently.
format Online
Article
Text
id pubmed-5206857
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-5206857
2017-01-17
A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism
Lu, Yang
Ma, Xiaolei
Lu, Yinan
Zhou, Yuxin
Pei, Zhili
Comput Math Methods Med
Research Article
Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the performance of the classification algorithms. In this study, a sample selection algorithm based on sequential patterns is proposed to filter negative samples in the training phase. Considering the joint information between the trigger and argument of multiargument events, we extract triplets of multiargument events directly using a support vector machine classifier. A joint scoring mechanism, which is based on sentence similarity and the importance of the trigger in the training data, is used to correct the predicted results. Experimental results indicate that the proposed method can extract events efficiently.
Hindawi Publishing Corporation
2016
2016-12-14
/pmc/articles/PMC5206857/
/pubmed/28096894
http://dx.doi.org/10.1155/2016/7536494
Text
en
Copyright © 2016 Yang Lu et al.
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism
topic Research Article