Semantically linking molecular entities in literature through entity relationships
BACKGROUND: Text mining tools have gained popularity to process the vast amount of available research articles in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real life scenarios. Studies of mining non-causal molec...
Main authors: | Van Landeghem, Sofie; Björne, Jari; Abeel, Thomas; De Baets, Bernard; Salakoski, Tapio; Van de Peer, Yves |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384255/ https://www.ncbi.nlm.nih.gov/pubmed/22759460 http://dx.doi.org/10.1186/1471-2105-13-S11-S6 |
_version_ | 1782236684273844224 |
---|---|
author | Van Landeghem, Sofie Björne, Jari Abeel, Thomas De Baets, Bernard Salakoski, Tapio Van de Peer, Yves |
author_facet | Van Landeghem, Sofie Björne, Jari Abeel, Thomas De Baets, Bernard Salakoski, Tapio Van de Peer, Yves |
author_sort | Van Landeghem, Sofie |
collection | PubMed |
description | BACKGROUND: Text mining tools have gained popularity for processing the vast number of research articles available in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance integration of text mining results with database facts. RESULTS: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving high-precision and high-recall results, respectively. Finally, we highlight extremely high-performance results (F-score >90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. CONCLUSIONS: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale. |
format | Online Article Text |
id | pubmed-3384255 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-33842552012-06-29 Semantically linking molecular entities in literature through entity relationships Van Landeghem, Sofie Björne, Jari Abeel, Thomas De Baets, Bernard Salakoski, Tapio Van de Peer, Yves BMC Bioinformatics Proceedings BioMed Central 2012-06-26 /pmc/articles/PMC3384255/ /pubmed/22759460 http://dx.doi.org/10.1186/1471-2105-13-S11-S6 Text en Copyright ©2012 Van Landeghem et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Van Landeghem, Sofie Björne, Jari Abeel, Thomas De Baets, Bernard Salakoski, Tapio Van de Peer, Yves Semantically linking molecular entities in literature through entity relationships |
title | Semantically linking molecular entities in literature through entity relationships |
title_full | Semantically linking molecular entities in literature through entity relationships |
title_fullStr | Semantically linking molecular entities in literature through entity relationships |
title_full_unstemmed | Semantically linking molecular entities in literature through entity relationships |
title_short | Semantically linking molecular entities in literature through entity relationships |
title_sort | semantically linking molecular entities in literature through entity relationships |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384255/ https://www.ncbi.nlm.nih.gov/pubmed/22759460 http://dx.doi.org/10.1186/1471-2105-13-S11-S6 |
work_keys_str_mv | AT vanlandeghemsofie semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships AT bjornejari semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships AT abeelthomas semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships AT debaetsbernard semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships AT salakoskitapio semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships AT vandepeeryves semanticallylinkingmolecularentitiesinliteraturethroughentityrelationships |
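
The abstract in this record mentions combining two relation extraction systems by intersection and by union, with the intersection favouring precision and the union favouring recall. The sketch below illustrates that general idea on toy data only. It is not the authors' implementation: the `Relation` encoding, the function names, and every example relation are hypothetical and are not taken from the article or the BioNLP Shared Task data.

```python
from typing import Set, Tuple

# Hypothetical encoding of one predicted entity relation:
# (gene symbol, domain term). The real task data is richer than this.
Relation = Tuple[str, str]


def precision_recall_f1(predicted: Set[Relation], gold: Set[Relation]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) of predictions against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def combine(a: Set[Relation], b: Set[Relation], mode: str) -> Set[Relation]:
    """Combine two systems' predictions by set intersection or set union."""
    if mode == "intersection":
        return a & b  # keep only relations both systems agree on
    if mode == "union":
        return a | b  # keep every relation predicted by either system
    raise ValueError(f"unknown mode: {mode}")


# Toy data, invented purely for illustration.
gold = {("geneA", "promoter"), ("geneB", "complex"),
        ("geneC", "domain"), ("geneD", "subunit")}
system_a = {("geneA", "promoter"), ("geneB", "complex"), ("geneE", "promoter")}
system_b = {("geneB", "complex"), ("geneC", "domain"), ("geneF", "complex")}

intersection_scores = precision_recall_f1(combine(system_a, system_b, "intersection"), gold)
union_scores = precision_recall_f1(combine(system_a, system_b, "union"), gold)
```

On this toy data the intersection keeps only the one relation both systems agree on, so it gains precision and loses recall, while the union does the opposite. Whether the same trade-off holds on real data depends on how the two systems' errors overlap.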