Semantically linking molecular entities in literature through entity relationships

BACKGROUND: Text mining tools have gained popularity to process the vast amount of available research articles in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real life scenarios. Studies of mining non-causal molec...

Bibliographic Details
Main authors: Van Landeghem, Sofie, Björne, Jari, Abeel, Thomas, De Baets, Bernard, Salakoski, Tapio, Van de Peer, Yves
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384255/
https://www.ncbi.nlm.nih.gov/pubmed/22759460
http://dx.doi.org/10.1186/1471-2105-13-S11-S6
_version_ 1782236684273844224
author Van Landeghem, Sofie
Björne, Jari
Abeel, Thomas
De Baets, Bernard
Salakoski, Tapio
Van de Peer, Yves
author_facet Van Landeghem, Sofie
Björne, Jari
Abeel, Thomas
De Baets, Bernard
Salakoski, Tapio
Van de Peer, Yves
author_sort Van Landeghem, Sofie
collection PubMed
description BACKGROUND: Text mining tools have gained popularity to process the vast amount of available research articles in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance integration of text mining results with database facts. RESULTS: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving respectively high-precision and high-recall results. Finally, we highlight extremely high-performance results (F-score >90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. CONCLUSIONS: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale.
format Online
Article
Text
id pubmed-3384255
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3384255 2012-06-29 Semantically linking molecular entities in literature through entity relationships Van Landeghem, Sofie Björne, Jari Abeel, Thomas De Baets, Bernard Salakoski, Tapio Van de Peer, Yves BMC Bioinformatics Proceedings BACKGROUND: Text mining tools have gained popularity to process the vast amount of available research articles in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance integration of text mining results with database facts. RESULTS: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving respectively high-precision and high-recall results. Finally, we highlight extremely high-performance results (F-score >90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. CONCLUSIONS: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed.
The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale. BioMed Central 2012-06-26 /pmc/articles/PMC3384255/ /pubmed/22759460 http://dx.doi.org/10.1186/1471-2105-13-S11-S6 Text en Copyright ©2012 Van Landeghem et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Van Landeghem, Sofie
Björne, Jari
Abeel, Thomas
De Baets, Bernard
Salakoski, Tapio
Van de Peer, Yves
Semantically linking molecular entities in literature through entity relationships
title Semantically linking molecular entities in literature through entity relationships
title_full Semantically linking molecular entities in literature through entity relationships
title_fullStr Semantically linking molecular entities in literature through entity relationships
title_full_unstemmed Semantically linking molecular entities in literature through entity relationships
title_short Semantically linking molecular entities in literature through entity relationships
title_sort semantically linking molecular entities in literature through entity relationships
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384255/
https://www.ncbi.nlm.nih.gov/pubmed/22759460
http://dx.doi.org/10.1186/1471-2105-13-S11-S6
work_keys_str_mv AT vanlandeghemsofie semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships
AT bjornejari semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships
AT abeelthomas semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships
AT debaetsbernard semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships
AT salakoskitapio semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships
AT vandepeeryves semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships