Annotation and extraction of age and temporally-related events from clinical histories
BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient’s disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well...
Main Authors: | Hong, Judy; Davoudi, Anahita; Yu, Shun; Mowery, Danielle L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772895/ https://www.ncbi.nlm.nih.gov/pubmed/33380319 http://dx.doi.org/10.1186/s12911-020-01333-5 |
_version_ | 1783629959467630592 |
---|---|
author | Hong, Judy Davoudi, Anahita Yu, Shun Mowery, Danielle L. |
collection | PubMed |
description | BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient’s disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well captured, consistently codified, and readily available to research databases for study. METHODS: We expanded upon existing annotation schemes to capture additional age and temporal information, conducted an annotation study to validate our expanded schema, and developed a prototypical, rule-based Named Entity Recognizer to extract our novel clinical named entities (NE). The annotation study was conducted on 138 discharge summaries from the pre-annotated 2014 ShARe/CLEF eHealth Challenge corpus. In addition to existing NE classes (TIMEX3, SUBJECT_CLASS, DISEASE_DISORDER), our schema proposes 3 additional NEs (AGE, PROCEDURE, OTHER_EVENTS). We also propose new attributes, e.g., “degree_relation” which captures the degree of biological relation for subjects annotated under SUBJECT_CLASS. As a proof of concept, we applied the schema to 49 H&P notes to encode pertinent history information for a lung cancer cohort study. RESULTS: An abundance of information was captured under the new OTHER_EVENTS, PROCEDURE and AGE classes, with 23%, 10% and 8% of all annotated NEs belonging to the above classes, respectively. We observed high inter-annotator agreement of >80% for AGE and TIMEX3; the automated NLP system achieved F1 scores of 86% (AGE) and 86% (TIMEX3). Age and temporally-specified mentions within past medical, family, surgical, and social histories were common in our lung cancer data set; annotation is ongoing to support this translational research study. CONCLUSIONS: Our annotation schema and NLP system can encode historical events from clinical notes to support clinical and translational research studies. |
format | Online Article Text |
id | pubmed-7772895 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7772895 2020-12-30 Annotation and extraction of age and temporally-related events from clinical histories Hong, Judy Davoudi, Anahita Yu, Shun Mowery, Danielle L. BMC Med Inform Decis Mak Research BioMed Central 2020-12-30 /pmc/articles/PMC7772895/ /pubmed/33380319 http://dx.doi.org/10.1186/s12911-020-01333-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Annotation and extraction of age and temporally-related events from clinical histories |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772895/ https://www.ncbi.nlm.nih.gov/pubmed/33380319 http://dx.doi.org/10.1186/s12911-020-01333-5 |
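
The description above mentions a prototypical, rule-based Named Entity Recognizer for the AGE class but does not give its rules. The following is a minimal sketch of what one such rule set might look like, assuming simple regular expressions over raw note text. The patterns, the `Mention` type, and the `extract_ages` function are hypothetical illustrations, not the authors' system.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    """A single extracted named entity with character offsets."""
    label: str
    text: str
    start: int
    end: int


# Hypothetical AGE patterns (assumptions for illustration, not from the paper).
AGE_PATTERNS = [
    # "67-year-old", "67 year old", "67 yr old"
    re.compile(r"\b\d{1,3}[- ]?(?:year|yr)s?[- ]?old\b", re.IGNORECASE),
    # "age 72", "at age 72", "aged 72", "age of 72"
    re.compile(r"\b(?:at\s+)?age(?:d)?\s+(?:of\s+)?\d{1,3}\b", re.IGNORECASE),
    # "45 yo", "45 y/o"
    re.compile(r"\b\d{1,3}\s?(?:y/o|yo)\b", re.IGNORECASE),
]


def extract_ages(text: str) -> List[Mention]:
    """Return AGE mentions found by any pattern, sorted by start offset.

    Overlapping matches from different patterns are not resolved here.
    """
    mentions = []
    for pattern in AGE_PATTERNS:
        for match in pattern.finditer(text):
            mentions.append(
                Mention("AGE", match.group(0), match.start(), match.end())
            )
    mentions.sort(key=lambda m: m.start)
    return mentions


if __name__ == "__main__":
    note = ("The patient is a 67-year-old male whose father died of "
            "lung cancer at age 72.")
    for mention in extract_ages(note):
        print(mention)
```

A real system would likely need many more patterns, plus handling for spelled-out numbers and section context (for example, distinguishing the patient's age from a family member's age at diagnosis), which this sketch does not attempt.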