USI: a fast and accurate approach for conceptual document annotation

BACKGROUND: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task...

Bibliographic Details
Main authors: Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367850/
https://www.ncbi.nlm.nih.gov/pubmed/25887746
http://dx.doi.org/10.1186/s12859-015-0513-4
author Fiorini, Nicolas
Ranwez, Sylvie
Montmain, Jacky
Ranwez, Vincent
collection PubMed
description BACKGROUND: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. RESULTS: In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related, already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are performed using standard metrics and semantic similarity. CONCLUSIONS: By relying only on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion, instead of one score per concept.
format Online
Article
Text
id pubmed-4367850
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4367850 2015-03-21 USI: a fast and accurate approach for conceptual document annotation Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent BMC Bioinformatics Research Article BioMed Central 2015-03-14 /pmc/articles/PMC4367850/ /pubmed/25887746 http://dx.doi.org/10.1186/s12859-015-0513-4 Text en © Fiorini et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title USI: a fast and accurate approach for conceptual document annotation
topic Research Article