A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism
Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the per...
Main Authors: | Lu, Yang; Ma, Xiaolei; Lu, Yinan; Zhou, Yuxin; Pei, Zhili |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206857/ https://www.ncbi.nlm.nih.gov/pubmed/28096894 http://dx.doi.org/10.1155/2016/7536494 |
_version_ | 1782490309044731904 |
---|---|
author | Lu, Yang Ma, Xiaolei Lu, Yinan Zhou, Yuxin Pei, Zhili |
author_facet | Lu, Yang Ma, Xiaolei Lu, Yinan Zhou, Yuxin Pei, Zhili |
author_sort | Lu, Yang |
collection | PubMed |
description | Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the performance of the classification algorithms. In this study, a sample selection algorithm based on sequential pattern is proposed to filter negative samples in the training phase. Considering the joint information between the trigger and argument of multiargument events, we extract triplets of multiargument events directly using a support vector machine classifier. A joint scoring mechanism, which is based on sentence similarity and importance of trigger in the training data, is used to correct the predicted results. Experimental results indicate that the proposed method can extract events efficiently. |
format | Online Article Text |
id | pubmed-5206857 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-52068572017-01-17 A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism Lu, Yang Ma, Xiaolei Lu, Yinan Zhou, Yuxin Pei, Zhili Comput Math Methods Med Research Article Biomedical event extraction is an important and difficult task in bioinformatics. With the rapid growth of biomedical literature, the extraction of complex events from unstructured text has attracted more attention. However, the annotated biomedical corpus is highly imbalanced, which affects the performance of the classification algorithms. In this study, a sample selection algorithm based on sequential pattern is proposed to filter negative samples in the training phase. Considering the joint information between the trigger and argument of multiargument events, we extract triplets of multiargument events directly using a support vector machine classifier. A joint scoring mechanism, which is based on sentence similarity and importance of trigger in the training data, is used to correct the predicted results. Experimental results indicate that the proposed method can extract events efficiently. Hindawi Publishing Corporation 2016 2016-12-14 /pmc/articles/PMC5206857/ /pubmed/28096894 http://dx.doi.org/10.1155/2016/7536494 Text en Copyright © 2016 Yang Lu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Lu, Yang Ma, Xiaolei Lu, Yinan Zhou, Yuxin Pei, Zhili A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism |
title | A Novel Sample Selection Strategy for Imbalanced Data of Biomedical Event Extraction with Joint Scoring Mechanism |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206857/ https://www.ncbi.nlm.nih.gov/pubmed/28096894 http://dx.doi.org/10.1155/2016/7536494 |