Disambiguation of biomedical text using diverse sources of information

BACKGROUND: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of various sources of...

Bibliographic Details
Main authors: Stevenson, Mark, Guo, Yikun, Gaizauskas, Robert, Martinez, David
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586756/
https://www.ncbi.nlm.nih.gov/pubmed/19025693
http://dx.doi.org/10.1186/1471-2105-9-S11-S7
author Stevenson, Mark
Guo, Yikun
Gaizauskas, Robert
Martinez, David
collection PubMed
description BACKGROUND: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of various sources of information including linguistic features of the context in which the ambiguous term is used and domain-specific resources, such as UMLS. MATERIALS AND METHODS: We compare various sources of information including ones which have been previously used and a novel one: MeSH terms. Evaluation is carried out using a standard test set (the NLM-WSD corpus). RESULTS: The best performance is obtained using a combination of linguistic features and MeSH terms. Performance of our system exceeds previously published results for systems evaluated using the same data set. CONCLUSION: Disambiguation of biomedical terms benefits from the use of information from a variety of sources. In particular, MeSH terms have proved to be useful and should be used if available.
format Text
id pubmed-2586756
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2586756 2008-11-26 Disambiguation of biomedical text using diverse sources of information Stevenson, Mark Guo, Yikun Gaizauskas, Robert Martinez, David BMC Bioinformatics Research BACKGROUND: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of various sources of information including linguistic features of the context in which the ambiguous term is used and domain-specific resources, such as UMLS. MATERIALS AND METHODS: We compare various sources of information including ones which have been previously used and a novel one: MeSH terms. Evaluation is carried out using a standard test set (the NLM-WSD corpus). RESULTS: The best performance is obtained using a combination of linguistic features and MeSH terms. Performance of our system exceeds previously published results for systems evaluated using the same data set. CONCLUSION: Disambiguation of biomedical terms benefits from the use of information from a variety of sources. In particular, MeSH terms have proved to be useful and should be used if available. BioMed Central 2008-11-19 /pmc/articles/PMC2586756/ /pubmed/19025693 http://dx.doi.org/10.1186/1471-2105-9-S11-S7 Text en Copyright © 2008 Stevenson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Disambiguation of biomedical text using diverse sources of information
topic Research