An extensive review of tools for manual annotation of documents

MOTIVATION: Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms. Further, annotation tools are also used to extract new information for a particular use case. However, owing to the high number of existing annotation tools, finding the one that best fits particular needs is a demanding task that requires searching the scientific literature followed by installing and trying various tools. METHODS: We searched for annotation tools and selected a subset of them according to five requirements with which they should comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. We defined each criterion on three levels of matches and a score for the final evaluation of the tools. RESULTS: We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. Full compliance with our 26 criteria ranged from only 9 up to 20 criteria, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. The highest score of 0.81 was obtained by WebAnno (of a maximum value of 1.0).

Bibliographic Details
Main Authors: Neves, Mariana, Ševa, Jurica
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Articles
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820865/
https://www.ncbi.nlm.nih.gov/pubmed/31838514
http://dx.doi.org/10.1093/bib/bbz130
_version_ 1783639301100142592
author Neves, Mariana
Ševa, Jurica
author_facet Neves, Mariana
Ševa, Jurica
author_sort Neves, Mariana
collection PubMed
description MOTIVATION: Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms. Further, annotation tools are also used to extract new information for a particular use case. However, owing to the high number of existing annotation tools, finding the one that best fits particular needs is a demanding task that requires searching the scientific literature followed by installing and trying various tools. METHODS: We searched for annotation tools and selected a subset of them according to five requirements with which they should comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. We defined each criterion on three levels of matches and a score for the final evaluation of the tools. RESULTS: We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. Full compliance with our 26 criteria ranged from only 9 up to 20 criteria, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. The highest score of 0.81 was obtained by WebAnno (of a maximum value of 1.0).
format Online
Article
Text
id pubmed-7820865
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7820865 2021-01-27 An extensive review of tools for manual annotation of documents Neves, Mariana Ševa, Jurica Brief Bioinform Articles MOTIVATION: Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms. Further, annotation tools are also used to extract new information for a particular use case. However, owing to the high number of existing annotation tools, finding the one that best fits particular needs is a demanding task that requires searching the scientific literature followed by installing and trying various tools. METHODS: We searched for annotation tools and selected a subset of them according to five requirements with which they should comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. We defined each criterion on three levels of matches and a score for the final evaluation of the tools. RESULTS: We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. Full compliance with our 26 criteria ranged from only 9 up to 20 criteria, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. The highest score of 0.81 was obtained by WebAnno (of a maximum value of 1.0). Oxford University Press 2019-12-15 /pmc/articles/PMC7820865/ /pubmed/31838514 http://dx.doi.org/10.1093/bib/bbz130 Text en © The Author(s) 2019. Published by Oxford University Press. All rights reserved. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Articles
Neves, Mariana
Ševa, Jurica
An extensive review of tools for manual annotation of documents
title An extensive review of tools for manual annotation of documents
title_full An extensive review of tools for manual annotation of documents
title_fullStr An extensive review of tools for manual annotation of documents
title_full_unstemmed An extensive review of tools for manual annotation of documents
title_short An extensive review of tools for manual annotation of documents
title_sort extensive review of tools for manual annotation of documents
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820865/
https://www.ncbi.nlm.nih.gov/pubmed/31838514
http://dx.doi.org/10.1093/bib/bbz130
work_keys_str_mv AT nevesmariana anextensivereviewoftoolsformanualannotationofdocuments
AT sevajurica anextensivereviewoftoolsformanualannotationofdocuments
AT nevesmariana extensivereviewoftoolsformanualannotationofdocuments
AT sevajurica extensivereviewoftoolsformanualannotationofdocuments