Complex event extraction at PubMed scale

Motivation: There has recently been a notable shift in biomedical information extraction (IE) from relation models toward the more expressive event model, facilitated by the maturation of basic tools for biomedical text analysis and the availability of manually annotated resources. The event model a...


Bibliographic Details
Main authors: Björne, Jari; Ginter, Filip; Pyysalo, Sampo; Tsujii, Jun'ichi; Salakoski, Tapio
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects: ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881365/
https://www.ncbi.nlm.nih.gov/pubmed/20529932
http://dx.doi.org/10.1093/bioinformatics/btq180
author Björne, Jari
Ginter, Filip
Pyysalo, Sampo
Tsujii, Jun'ichi
Salakoski, Tapio
author_sort Björne, Jari
collection PubMed
description Motivation: There has recently been a notable shift in biomedical information extraction (IE) from relation models toward the more expressive event model, facilitated by the maturation of basic tools for biomedical text analysis and the availability of manually annotated resources. The event model allows detailed representation of complex natural language statements and can support a number of advanced text mining applications ranging from semantic search to pathway extraction. A recent collaborative evaluation demonstrated the potential of event extraction systems, yet there have so far been no studies of the generalization ability of the systems nor the feasibility of large-scale extraction. Results: This study considers event-based IE at PubMed scale. We introduce a system combining publicly available, state-of-the-art methods for domain parsing, named entity recognition and event extraction, and test the system on a representative 1% sample of all PubMed citations. We present the first evaluation of the generalization performance of event extraction systems to this scale and show that despite its computational complexity, event extraction from the entire PubMed is feasible. We further illustrate the value of the extraction approach through a number of analyses of the extracted information. Availability: The event detection system and extracted data are open source licensed and available at http://bionlp.utu.fi/. Contact: jari.bjorne@utu.fi
format Text
id pubmed-2881365
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2881365 2010-06-08 Complex event extraction at PubMed scale Björne, Jari Ginter, Filip Pyysalo, Sampo Tsujii, Jun'ichi Salakoski, Tapio Bioinformatics ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881365/ /pubmed/20529932 http://dx.doi.org/10.1093/bioinformatics/btq180 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Complex event extraction at PubMed scale
topic ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881365/
https://www.ncbi.nlm.nih.gov/pubmed/20529932
http://dx.doi.org/10.1093/bioinformatics/btq180