Distinguishing the species of biomedical named entities for term identification
BACKGROUND: Term identification is the task of grounding ambiguous mentions of biomedical named entities in text to unique database identifiers. Previous work on term identification has focused on studying species-specific documents. However, full-length articles often describe entities across a num...
Main authors: | Wang, Xinglong; Matthews, Michael |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586755/ https://www.ncbi.nlm.nih.gov/pubmed/19025692 http://dx.doi.org/10.1186/1471-2105-9-S11-S6 |
_version_ | 1782160909103267840 |
---|---|
author | Wang, Xinglong Matthews, Michael |
author_sort | Wang, Xinglong |
collection | PubMed |
description | BACKGROUND: Term identification is the task of grounding ambiguous mentions of biomedical named entities in text to unique database identifiers. Previous work on term identification has focused on studying species-specific documents. However, full-length articles often describe entities across a number of species, in which case resolving the ambiguity of model organisms in entities is critical to achieving accurate term identification. RESULTS: We developed and compared a number of rule-based and machine-learning based approaches to resolving species ambiguity in mentions of biomedical named entities, and demonstrated that a hybrid method achieved the best overall accuracy at 71.7%, as tested on the gold-standard ITI-TXM corpora. By utilising the species information predicted by the hybrid tagger, our rule-based term identification system was improved significantly by up to 11.6%. CONCLUSION: This paper shows that, in the context of identifying terms involving multiple model organisms, integration of an accurate species disambiguation system can significantly improve the performance of term identification systems. |
format | Text |
id | pubmed-2586755 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2586755 2008-11-26 Distinguishing the species of biomedical named entities for term identification Wang, Xinglong Matthews, Michael BMC Bioinformatics Research BACKGROUND: Term identification is the task of grounding ambiguous mentions of biomedical named entities in text to unique database identifiers. Previous work on term identification has focused on studying species-specific documents. However, full-length articles often describe entities across a number of species, in which case resolving the ambiguity of model organisms in entities is critical to achieving accurate term identification. RESULTS: We developed and compared a number of rule-based and machine-learning based approaches to resolving species ambiguity in mentions of biomedical named entities, and demonstrated that a hybrid method achieved the best overall accuracy at 71.7%, as tested on the gold-standard ITI-TXM corpora. By utilising the species information predicted by the hybrid tagger, our rule-based term identification system was improved significantly by up to 11.6%. CONCLUSION: This paper shows that, in the context of identifying terms involving multiple model organisms, integration of an accurate species disambiguation system can significantly improve the performance of term identification systems. BioMed Central 2008-11-19 /pmc/articles/PMC2586755/ /pubmed/19025692 http://dx.doi.org/10.1186/1471-2105-9-S11-S6 Text en Copyright © 2008 Wang and Matthews; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Distinguishing the species of biomedical named entities for term identification |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586755/ https://www.ncbi.nlm.nih.gov/pubmed/19025692 http://dx.doi.org/10.1186/1471-2105-9-S11-S6 |