The effect of word sense disambiguation accuracy on literature based discovery
BACKGROUND: The volume of research published in the biomedical domain has increasingly led to researchers focussing on specific areas of interest and connections between findings being missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed co...
Main Authors: | Preiss, Judita Stevenson, Mark |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959388/ https://www.ncbi.nlm.nih.gov/pubmed/27455071 http://dx.doi.org/10.1186/s12911-016-0296-1 |
_version_ | 1782444396811124736 |
---|---|
author | Preiss, Judita Stevenson, Mark |
author_facet | Preiss, Judita Stevenson, Mark |
author_sort | Preiss, Judita |
collection | PubMed |
description | BACKGROUND: The volume of research published in the biomedical domain has increasingly led to researchers focussing on specific areas of interest and connections between findings being missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed connections between published information (also known as “hidden knowledge”). A common approach is to identify hidden knowledge via shared linking terms. However, biomedical documents are highly ambiguous, which can lead LBD systems to over-generate hidden knowledge by hypothesising connections through different meanings of linking terms. Word Sense Disambiguation (WSD) aims to resolve ambiguities in text by identifying the meaning of ambiguous terms. This study explores the effect of WSD accuracy on LBD performance. METHODS: An existing LBD system is employed and four approaches to WSD of biomedical documents are integrated with it. The accuracy of each WSD approach is determined by comparing its output against a standard benchmark. Evaluation of the LBD output is carried out using a time-slicing approach, where hidden knowledge is generated from articles published prior to a certain cutoff date and a gold standard is extracted from publications after the cutoff date. RESULTS: WSD accuracy varies depending on the approach used. The connection between the performance of the LBD and WSD systems is analysed to reveal a correlation between WSD accuracy and LBD performance. CONCLUSION: This study reveals that LBD performance is sensitive to WSD accuracy. It is therefore concluded that WSD has the potential to improve the output of LBD systems by reducing the amount of spurious hidden knowledge that is generated. It is also suggested that further improvements in WSD accuracy have the potential to improve LBD accuracy. |
format | Online Article Text |
id | pubmed-4959388 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4959388 2016-08-01 The effect of word sense disambiguation accuracy on literature based discovery Preiss, Judita Stevenson, Mark BMC Med Inform Decis Mak Research BioMed Central 2016-07-18 /pmc/articles/PMC4959388/ /pubmed/27455071 http://dx.doi.org/10.1186/s12911-016-0296-1 Text en © Preiss and Stevenson. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Preiss, Judita Stevenson, Mark The effect of word sense disambiguation accuracy on literature based discovery |
title | The effect of word sense disambiguation accuracy on literature based discovery |
title_sort | effect of word sense disambiguation accuracy on literature based discovery |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959388/ https://www.ncbi.nlm.nih.gov/pubmed/27455071 http://dx.doi.org/10.1186/s12911-016-0296-1 |
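
The abstract's two central ideas, hidden knowledge found through shared linking terms and time-sliced evaluation, can be illustrated with a minimal Python sketch. This is not the LBD system or any of the WSD approaches used in the study: the function names, the toy documents, the sense labels such as `cold/temperature`, and the choice to treat the cutoff year itself as "after" are all assumptions made for illustration.

```python
from itertools import combinations


def direct_pairs(docs):
    """Unordered term pairs that co-occur in at least one document."""
    pairs = set()
    for terms in docs:
        pairs.update(combinations(sorted(set(terms)), 2))
    return pairs


def hidden_knowledge(docs):
    """Pairs (a, c) never seen together that share a linking term b."""
    known = direct_pairs(docs)
    neighbours = {}
    for a, b in known:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    candidates = set()
    for linked in neighbours.values():
        for a, c in combinations(sorted(linked), 2):
            if (a, c) not in known:
                candidates.add((a, c))
    return candidates


def timeslice_eval(dated_docs, cutoff):
    """Precision and recall of pre-cutoff candidates against post-cutoff pairs."""
    before = [terms for year, terms in dated_docs if year < cutoff]
    after = [terms for year, terms in dated_docs if year >= cutoff]
    candidates = hidden_knowledge(before)
    # Gold standard: pairs that first co-occur after the cutoff.
    gold = direct_pairs(after) - direct_pairs(before)
    hits = candidates & gold
    precision = len(hits) / len(candidates) if candidates else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy corpus: (publication year, terms found in the document).
raw_docs = [
    (2000, ["fish_oil", "blood_viscosity"]),
    (2001, ["blood_viscosity", "raynaud"]),
    (2002, ["cold", "raynaud"]),  # "cold" meaning low temperature
    (2003, ["cold", "zinc"]),     # "cold" meaning the common cold
    (2010, ["fish_oil", "raynaud"]),
]

# The same corpus after an idealised (assumed perfect) WSD step.
tagged_docs = [
    (2000, ["fish_oil", "blood_viscosity"]),
    (2001, ["blood_viscosity", "raynaud"]),
    (2002, ["cold/temperature", "raynaud"]),
    (2003, ["cold/illness", "zinc"]),
    (2010, ["fish_oil", "raynaud"]),
]

raw_scores = timeslice_eval(raw_docs, cutoff=2005)
tagged_scores = timeslice_eval(tagged_docs, cutoff=2005)
print("without WSD:", raw_scores)
print("with WSD:   ", tagged_scores)
```

In this toy corpus the ambiguous term `cold` links `raynaud` to `zinc` through two different senses. Tagging the senses removes that spurious candidate while leaving the later-confirmed `fish_oil` and `raynaud` pair in place, so precision rises and recall is unchanged.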