Annotation and extraction of age and temporally-related events from clinical histories

BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient’s disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well...

Bibliographic Details
Main Authors: Hong, Judy; Davoudi, Anahita; Yu, Shun; Mowery, Danielle L.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772895/
https://www.ncbi.nlm.nih.gov/pubmed/33380319
http://dx.doi.org/10.1186/s12911-020-01333-5
author Hong, Judy
Davoudi, Anahita
Yu, Shun
Mowery, Danielle L.
collection PubMed
description BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient’s disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well captured, consistently codified, and readily available to research databases for study. METHODS: We expanded upon existing annotation schemes to capture additional age and temporal information, conducted an annotation study to validate our expanded schema, and developed a prototypical, rule-based Named Entity Recognizer to extract our novel clinical named entities (NE). The annotation study was conducted on 138 discharge summaries from the pre-annotated 2014 ShARe/CLEF eHealth Challenge corpus. In addition to existing NE classes (TIMEX3, SUBJECT_CLASS, DISEASE_DISORDER), our schema proposes 3 additional NEs (AGE, PROCEDURE, OTHER_EVENTS). We also propose new attributes, e.g., “degree_relation” which captures the degree of biological relation for subjects annotated under SUBJECT_CLASS. As a proof of concept, we applied the schema to 49 H&P notes to encode pertinent history information for a lung cancer cohort study. RESULTS: An abundance of information was captured under the new OTHER_EVENTS, PROCEDURE and AGE classes, with 23%, 10% and 8% of all annotated NEs belonging to the above classes, respectively. We observed high inter-annotator agreement of >80% for AGE and TIMEX3; the automated NLP system achieved F1 scores of 86% (AGE) and 86% (TIMEX3). Age and temporally-specified mentions within past medical, family, surgical, and social histories were common in our lung cancer data set; annotation is ongoing to support this translational research study. CONCLUSIONS: Our annotation schema and NLP system can encode historical events from clinical notes to support clinical and translational research studies.
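The abstract describes a prototypical, rule-based Named Entity Recognizer for AGE and TIMEX3 mentions but does not list its rules. The following is a minimal sketch of what such a recognizer could look like, assuming simple regular-expression rules. The patterns, the `Entity` type, and the `extract_entities` function are hypothetical illustrations and are not the authors' implementation.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    label: str   # named-entity class, e.g. "AGE" or "TIMEX3"
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    text: str    # the matched surface string


# Hypothetical rules for illustration only; the paper's actual rules are not given here.
RULES = {
    "AGE": [
        re.compile(r"\b\d{1,3}[- ]?(?:year|yr)s?[- ]old\b", re.IGNORECASE),
        re.compile(r"\b(?:at )?age(?: of)? \d{1,3}\b", re.IGNORECASE),
    ],
    "TIMEX3": [
        re.compile(r"\b(?:19|20)\d{2}\b"),  # four-digit years
        re.compile(r"\b\d+ (?:day|week|month|year)s? ago\b", re.IGNORECASE),
    ],
}


def extract_entities(text: str) -> List[Entity]:
    """Apply every rule to the text and return mentions sorted by start offset."""
    found = []
    for label, patterns in RULES.items():
        for pattern in patterns:
            for match in pattern.finditer(text):
                found.append(Entity(label, match.start(), match.end(), match.group()))
    return sorted(found, key=lambda e: e.start)


if __name__ == "__main__":
    note = (
        "62-year-old man whose mother was diagnosed with lung cancer at age 45. "
        "He quit smoking 5 years ago and had a lobectomy in 2010."
    )
    for entity in extract_entities(note):
        print(entity.label, repr(entity.text), entity.start, entity.end)
```

A regex-only approach like this does not resolve overlapping matches or attach attributes such as "degree_relation"; those would need additional logic beyond this sketch.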
format Online
Article
Text
id pubmed-7772895
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-77728952020-12-30 Annotation and extraction of age and temporally-related events from clinical histories Hong, Judy Davoudi, Anahita Yu, Shun Mowery, Danielle L. BMC Med Inform Decis Mak Research BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient’s disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well captured, consistently codified, and readily available to research databases for study. METHODS: We expanded upon existing annotation schemes to capture additional age and temporal information, conducted an annotation study to validate our expanded schema, and developed a prototypical, rule-based Named Entity Recognizer to extract our novel clinical named entities (NE). The annotation study was conducted on 138 discharge summaries from the pre-annotated 2014 ShARe/CLEF eHealth Challenge corpus. In addition to existing NE classes (TIMEX3, SUBJECT_CLASS, DISEASE_DISORDER), our schema proposes 3 additional NEs (AGE, PROCEDURE, OTHER_EVENTS). We also propose new attributes, e.g., “degree_relation” which captures the degree of biological relation for subjects annotated under SUBJECT_CLASS. As a proof of concept, we applied the schema to 49 H&P notes to encode pertinent history information for a lung cancer cohort study. RESULTS: An abundance of information was captured under the new OTHER_EVENTS, PROCEDURE and AGE classes, with 23%, 10% and 8% of all annotated NEs belonging to the above classes, respectively. We observed high inter-annotator agreement of >80% for AGE and TIMEX3; the automated NLP system achieved F1 scores of 86% (AGE) and 86% (TIMEX3). Age and temporally-specified mentions within past medical, family, surgical, and social histories were common in our lung cancer data set; annotation is ongoing to support this translational research study. 
CONCLUSIONS: Our annotation schema and NLP system can encode historical events from clinical notes to support clinical and translational research studies. BioMed Central 2020-12-30 /pmc/articles/PMC7772895/ /pubmed/33380319 http://dx.doi.org/10.1186/s12911-020-01333-5 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Hong, Judy
Davoudi, Anahita
Yu, Shun
Mowery, Danielle L.
Annotation and extraction of age and temporally-related events from clinical histories
title Annotation and extraction of age and temporally-related events from clinical histories
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772895/
https://www.ncbi.nlm.nih.gov/pubmed/33380319
http://dx.doi.org/10.1186/s12911-020-01333-5