
The effect of word sense disambiguation accuracy on literature based discovery

BACKGROUND: The volume of research published in the biomedical domain has increasingly led to researchers focussing on specific areas of interest and connections between findings being missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed co...


Bibliographic Details
Main authors: Preiss, Judita, Stevenson, Mark
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959388/
https://www.ncbi.nlm.nih.gov/pubmed/27455071
http://dx.doi.org/10.1186/s12911-016-0296-1
author Preiss, Judita
Stevenson, Mark
collection PubMed
description BACKGROUND: The volume of research published in the biomedical domain has increasingly led to researchers focussing on specific areas of interest and connections between findings being missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed connections between published information (also known as “hidden knowledge”). A common approach is to identify hidden knowledge via shared linking terms. However, biomedical documents are highly ambiguous, which can lead LBD systems to over-generate hidden knowledge by hypothesising connections through different meanings of linking terms. Word Sense Disambiguation (WSD) aims to resolve ambiguities in text by identifying the meaning of ambiguous terms. This study explores the effect of WSD accuracy on LBD performance. METHODS: An existing LBD system is employed and four approaches to WSD of biomedical documents are integrated with it. The accuracy of each WSD approach is determined by comparing its output against a standard benchmark. Evaluation of the LBD output is carried out using a timeslicing approach, where hidden knowledge is generated from articles published prior to a certain cutoff date and a gold standard is extracted from publications after the cutoff date. RESULTS: WSD accuracy varies depending on the approach used. The connection between the performance of the LBD and WSD systems is analysed to reveal a correlation between WSD accuracy and LBD performance. CONCLUSION: This study reveals that LBD performance is sensitive to WSD accuracy. It is therefore concluded that WSD has the potential to improve the output of LBD systems by reducing the amount of spurious hidden knowledge that is generated. It is also suggested that further improvements in WSD accuracy have the potential to improve LBD accuracy.
format Online
Article
Text
id pubmed-4959388
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4959388 2016-08-01 The effect of word sense disambiguation accuracy on literature based discovery Preiss, Judita Stevenson, Mark BMC Med Inform Decis Mak Research BioMed Central 2016-07-18 /pmc/articles/PMC4959388/ /pubmed/27455071 http://dx.doi.org/10.1186/s12911-016-0296-1 Text en © Preiss and Stevenson. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The effect of word sense disambiguation accuracy on literature based discovery
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959388/
https://www.ncbi.nlm.nih.gov/pubmed/27455071
http://dx.doi.org/10.1186/s12911-016-0296-1