USI: a fast and accurate approach for conceptual document annotation
BACKGROUND: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task...
Main authors: | Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367850/ https://www.ncbi.nlm.nih.gov/pubmed/25887746 http://dx.doi.org/10.1186/s12859-015-0513-4 |
_version_ | 1782362553843712000 |
---|---|
author | Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent |
collection | PubMed |
description | BACKGROUND: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. To keep such applications up to date, new entities need to be annotated frequently to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. RESULTS: In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution that suggests a conceptual annotation for new entities based on related, already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. CONCLUSIONS: By relying only on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet it provides better results than the other approaches by giving a consistent annotation scored with a global criterion, instead of one score per concept. |
format | Online Article Text |
id | pubmed-4367850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4367850 2015-03-21 USI: a fast and accurate approach for conceptual document annotation. Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent. BMC Bioinformatics. Research Article. BioMed Central 2015-03-14 /pmc/articles/PMC4367850/ /pubmed/25887746 http://dx.doi.org/10.1186/s12859-015-0513-4 Text en © Fiorini et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | USI: a fast and accurate approach for conceptual document annotation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367850/ https://www.ncbi.nlm.nih.gov/pubmed/25887746 http://dx.doi.org/10.1186/s12859-015-0513-4 |