Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding

Applying methods in natural language processing on electronic health records (EHR) data is a growing field. Existing corpus and annotation focus on modeling textual features and relation prediction. However, there is a paucity of annotated corpus built to model clinical diagnostic thinking, a proces...

Bibliographic Details
Main authors: Gao, Yanjun; Dligach, Dmitriy; Miller, Timothy; Tesch, Samuel; Laffin, Ryan; Churpek, Matthew M.; Afshar, Majid
Format: Online Article Text
Language: English
Published: 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9354726/
https://www.ncbi.nlm.nih.gov/pubmed/35939277
author Gao, Yanjun
Dligach, Dmitriy
Miller, Timothy
Tesch, Samuel
Laffin, Ryan
Churpek, Matthew M.
Afshar, Majid
author_sort Gao, Yanjun
collection PubMed
description Applying methods in natural language processing on electronic health records (EHR) data is a growing field. Existing corpus and annotation focus on modeling textual features and relation prediction. However, there is a paucity of annotated corpus built to model clinical diagnostic thinking, a process involving text understanding, domain knowledge abstraction and reasoning. This work introduces a hierarchical annotation schema with three stages to address clinical text understanding, clinical reasoning, and summarization. We created an annotated corpus based on an extensive collection of publicly available daily progress notes, a type of EHR documentation that is collected in time series in a problem-oriented format. The conventional format for a progress note follows a Subjective, Objective, Assessment and Plan heading (SOAP). We also define a new suite of tasks, Progress Note Understanding, with three tasks utilizing the three annotation stages. The novel suite of tasks was designed to train and evaluate future NLP models for clinical text understanding, clinical knowledge representation, inference, and summarization.
format Online
Article
Text
id pubmed-9354726
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9354726 2022-08-05 LREC Int Conf Lang Resour Eval Article 2022-06 /pmc/articles/PMC9354726/ /pubmed/35939277 Text en https://creativecommons.org/licenses/by-nc/4.0/ European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0
title Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9354726/
https://www.ncbi.nlm.nih.gov/pubmed/35939277