Extracting biomedical events from pairs of text entities

BACKGROUND: Huge amounts of electronic biomedical documents, such as molecular biology reports or genomic papers, are generated daily. Nowadays, these documents are mainly available as unstructured free text, which requires heavy processing before it can be registered in organized databases....

Bibliographic Details
Main authors: Liu, Xiao; Bordes, Antoine; Grandvalet, Yves
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4511465/
https://www.ncbi.nlm.nih.gov/pubmed/26201478
http://dx.doi.org/10.1186/1471-2105-16-S10-S8
_version_ 1782382340705615872
author Liu, Xiao
Bordes, Antoine
Grandvalet, Yves
author_facet Liu, Xiao
Bordes, Antoine
Grandvalet, Yves
author_sort Liu, Xiao
collection PubMed
description BACKGROUND: Huge amounts of electronic biomedical documents, such as molecular biology reports or genomic papers, are generated daily. Nowadays, these documents are mainly available as unstructured free text, which requires heavy processing before it can be registered in organized databases. This organization is instrumental for information retrieval, making it possible to answer the advanced queries of researchers and practitioners in biology, medicine, and related fields. Hence, the massive data flow calls for efficient automatic text-mining methods that extract high-level information, such as biomedical events, from biomedical text. The usual computational tools of Natural Language Processing cannot be readily applied to extract these biomedical events, due to the peculiarities of the domain. Indeed, biomedical documents contain highly domain-specific jargon and syntax. These documents also describe distinctive dependencies, making text-mining in molecular biology a specific discipline. RESULTS: We address biomedical event extraction as the classification of pairs of text entities into the classes corresponding to event types. The candidate pairs of text entities are recursively provided to a multiclass classifier relying on Support Vector Machines. This recursive process extracts events involving other events as arguments. Compared to joint models based on Markov Random Fields, our model simplifies inference and hence requires shorter training and prediction times along with less memory. Compared to usual pipeline approaches, our model bypasses a complex intermediate problem while making more extensive use of sophisticated joint features between text entities. Our method focuses on the core event extraction of the Genia task of the BioNLP challenges, yielding the best result reported so far on the 2013 edition.
format Online
Article
Text
id pubmed-4511465
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4511465 2015-07-28 Extracting biomedical events from pairs of text entities Liu, Xiao Bordes, Antoine Grandvalet, Yves BMC Bioinformatics Research BioMed Central 2015-07-13 /pmc/articles/PMC4511465/ /pubmed/26201478 http://dx.doi.org/10.1186/1471-2105-16-S10-S8 Text en Copyright © 2015 Liu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Liu, Xiao
Bordes, Antoine
Grandvalet, Yves
Extracting biomedical events from pairs of text entities
title Extracting biomedical events from pairs of text entities
title_full Extracting biomedical events from pairs of text entities
title_fullStr Extracting biomedical events from pairs of text entities
title_full_unstemmed Extracting biomedical events from pairs of text entities
title_short Extracting biomedical events from pairs of text entities
title_sort extracting biomedical events from pairs of text entities
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4511465/
https://www.ncbi.nlm.nih.gov/pubmed/26201478
http://dx.doi.org/10.1186/1471-2105-16-S10-S8
work_keys_str_mv AT liuxiao extractingbiomedicaleventsfrompairsoftextentities
AT bordesantoine extractingbiomedicaleventsfrompairsoftextentities
AT grandvaletyves extractingbiomedicaleventsfrompairsoftextentities