Distinguishing the species of biomedical named entities for term identification

BACKGROUND: Term identification is the task of grounding ambiguous mentions of biomedical named entities in text to unique database identifiers. Previous work on term identification has focused on studying species-specific documents. However, full-length articles often describe entities across a num...

Bibliographic Details
Main Authors: Wang, Xinglong; Matthews, Michael
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586755/
https://www.ncbi.nlm.nih.gov/pubmed/19025692
http://dx.doi.org/10.1186/1471-2105-9-S11-S6
author Wang, Xinglong
Matthews, Michael
collection PubMed
description BACKGROUND: Term identification is the task of grounding ambiguous mentions of biomedical named entities in text to unique database identifiers. Previous work on term identification has focused on studying species-specific documents. However, full-length articles often describe entities across a number of species, in which case resolving the ambiguity of model organisms in entities is critical to achieving accurate term identification. RESULTS: We developed and compared a number of rule-based and machine-learning-based approaches to resolving species ambiguity in mentions of biomedical named entities, and demonstrated that a hybrid method achieved the best overall accuracy at 71.7%, as tested on the gold-standard ITI-TXM corpora. By utilising the species information predicted by the hybrid tagger, our rule-based term identification system was improved significantly by up to 11.6%. CONCLUSION: This paper shows that, in the context of identifying terms involving multiple model organisms, integration of an accurate species disambiguation system can significantly improve the performance of term identification systems.
format Text
id pubmed-2586755
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2586755 2008-11-26 Distinguishing the species of biomedical named entities for term identification Wang, Xinglong Matthews, Michael BMC Bioinformatics Research BioMed Central 2008-11-19 /pmc/articles/PMC2586755/ /pubmed/19025692 http://dx.doi.org/10.1186/1471-2105-9-S11-S6 Text en Copyright © 2008 Wang and Matthews; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Distinguishing the species of biomedical named entities for term identification
topic Research