
ANNO: A General Annotation Tool for Bilingual Clinical Note Information Extraction

OBJECTIVES: This study was conducted to develop a generalizable annotation tool for bilingual complex clinical text annotation, which led to the design and development of a clinical text annotation tool, ANNO. METHODS: We designed ANNO to enable human annotators to support the annotation of informat...

Bibliographic Details
Main Authors: Lee, Kye Hwa, Lee, Hyunsung, Park, Jin-Hyeok, Kim, Yi-Jun, Lee, Youngho
Format: Online Article Text
Language: English
Published: Korean Society of Medical Informatics 2022
Subjects: Case Report
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8850170/
https://www.ncbi.nlm.nih.gov/pubmed/35172094
http://dx.doi.org/10.4258/hir.2022.28.1.89
_version_ 1784652533609791488
author Lee, Kye Hwa
Lee, Hyunsung
Park, Jin-Hyeok
Kim, Yi-Jun
Lee, Youngho
collection PubMed
description OBJECTIVES: This study was conducted to develop a generalizable annotation tool for bilingual complex clinical text annotation, which led to the design and development of a clinical text annotation tool, ANNO. METHODS: We designed ANNO to enable human annotators to support the annotation of information in clinical documents efficiently and accurately. First, annotations for different classes (word or phrase types) can be tagged according to the type of word using the dictionary function. In addition, it is possible to evaluate and reconcile differences by comparing annotation results between human annotators. Moreover, if the regular expression set for each class is updated during annotation, it is automatically reflected in the new document. The regular expression set created by human annotators is designed such that a word tagged once is automatically labeled in new documents. RESULTS: Because ANNO is a Docker-based web application, users can use it freely without being subjected to dependency issues. Human annotators can share their annotation markups as regular expression sets with a dictionary structure, and they can cross-check their annotated corpora with each other. The dictionary-based regular expression sharing function, cross-check function for each annotator, and standardized input (Microsoft Excel) and output (extensible markup language [XML]) formats are the main features of ANNO. CONCLUSIONS: With the growing need for massively annotated clinical data to support the development of machine learning models, we expect ANNO to be helpful to many researchers.
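The dictionary-based workflow in the description above (per-class regular expression sets, automatic labeling of new documents, comparison between annotators, XML output) can be illustrated with a minimal Python sketch. This is not ANNO's code: the class names, patterns, sample note, the functions `auto_label`, `disagreements`, and `to_xml`, and the XML layout are all hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical class -> regex-list dictionary (not taken from ANNO).
# Bilingual patterns can sit side by side under the same class.
REGEX_DICT = {
    "DRUG": [r"aspirin", r"아스피린"],
    "DOSE": [r"\d+\s?mg"],
}


def auto_label(text, regex_dict):
    """Return sorted (start, end, class, surface) tuples for every pattern match."""
    spans = []
    for label, patterns in regex_dict.items():
        for pattern in patterns:
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), label, match.group()))
    return sorted(spans)


def disagreements(spans_a, spans_b):
    """Spans tagged by exactly one of two annotators (symmetric difference)."""
    return sorted(set(spans_a) ^ set(spans_b))


def to_xml(text, spans):
    """Serialize a document and its annotations to an assumed XML layout."""
    root = ET.Element("document")
    ET.SubElement(root, "text").text = text
    for start, end, label, surface in spans:
        attrs = {"start": str(start), "end": str(end), "class": label}
        ET.SubElement(root, "annotation", attrs).text = surface
    return ET.tostring(root, encoding="unicode")


# A made-up mixed English/Korean note.
note = "Aspirin 100 mg 투여, 아스피린 유지"
spans = auto_label(note, REGEX_DICT)

# Simulate a second annotator who missed the last span.
other_annotator = spans[:2]
print(disagreements(spans, other_annotator))
print(to_xml(note, spans))
```

Because the patterns live in a plain dictionary, adding a pattern to a class and calling `auto_label` again on a new document applies the updated set automatically, which mirrors the behavior the abstract describes.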
format Online
Article
Text
id pubmed-8850170
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Korean Society of Medical Informatics
record_format MEDLINE/PubMed
spelling pubmed-8850170 2022-02-26 Healthc Inform Res Case Report
Korean Society of Medical Informatics 2022-01 2022-01-31 /pmc/articles/PMC8850170/ /pubmed/35172094 http://dx.doi.org/10.4258/hir.2022.28.1.89 Text en © 2022 The Korean Society of Medical Informatics https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ANNO: A General Annotation Tool for Bilingual Clinical Note Information Extraction
topic Case Report