Quantifying risk factors in medical reports with a context-aware linear model

OBJECTIVE: We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests or behaviors) in text is a well-researched task. However, d...

Bibliographic Details
Main authors: Przybyła, Piotr; Brockmeier, Austin J; Ananiadou, Sophia
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6515525/
https://www.ncbi.nlm.nih.gov/pubmed/30840055
http://dx.doi.org/10.1093/jamia/ocz004
author Przybyła, Piotr
Brockmeier, Austin J
Ananiadou, Sophia
collection PubMed
description OBJECTIVE: We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests or behaviors) in text is a well-researched task. However, determining the level of risk associated with them is partly dependent on the textual context in which they appear, which may describe severity, temporal aspects, quantity, etc. METHODS: To take into account that a given word appearing in the context of different risk factors (medical concepts) can make different contributions toward risk level, we propose a multitask approach, called context-aware linear modeling, which can be applied using appropriately regularized linear regression. To improve the performance for risk factors unseen in training data (eg, rare diseases), we take into account their distributional similarity to other concepts. RESULTS: The evaluation is based on a corpus of 531 reports from EHRs with 99 376 risk factors rated manually by experts. While context-aware linear modeling significantly outperforms single-task models, taking into account concept similarity further improves performance, reaching the level of human annotators’ agreements. CONCLUSION: Our results show that automatic quantification of risk factors in EHRs can achieve performance comparable to human assessment, and taking into account the multitask structure of the problem and the ability to handle rare concepts is crucial for its accuracy.
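The multitask idea in the METHODS part of the abstract (a word may contribute differently to risk depending on the concept it accompanies) can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the model's exact form, so the feature layout below (one shared block of word weights plus one block per concept), the ridge penalty, the toy vocabulary, the ratings, and the names build_features and fit_ridge are all assumptions made for illustration. The concept-similarity extension for unseen risk factors is not covered.

```python
import numpy as np


def build_features(context_counts, concept_ids, n_concepts):
    """Copy each context vector into a shared block and into the block
    of the mention's own concept (hypothetical layout, see lead-in)."""
    n_mentions, vocab_size = context_counts.shape
    feats = np.zeros((n_mentions, vocab_size * (n_concepts + 1)))
    feats[:, :vocab_size] = context_counts  # weights shared by all concepts
    for i, concept in enumerate(concept_ids):
        start = vocab_size * (concept + 1)  # block owned by this concept
        feats[i, start:start + vocab_size] = context_counts[i]
    return feats


def fit_ridge(features, targets, l2=1.0):
    """Closed-form L2-regularized least squares."""
    dim = features.shape[1]
    gram = features.T @ features + l2 * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


# Toy data (invented): context words counted around four mentions.
vocab = ["severe", "mild", "history"]
context_counts = np.array(
    [[1, 0, 0],   # concept 0 with "severe"
     [0, 1, 0],   # concept 0 with "mild"
     [1, 0, 0],   # concept 1 with "severe"
     [0, 0, 1]],  # concept 1 with "history"
    dtype=float,
)
concept_ids = np.array([0, 0, 1, 1])
risk = np.array([3.0, 1.0, 2.0, 0.5])  # made-up expert ratings

X = build_features(context_counts, concept_ids, n_concepts=2)
weights = fit_ridge(X, risk, l2=0.1)
predictions = X @ weights

# "severe" gets one shared weight plus a separate weight per concept.
shared_severe = weights[0]
severe_for_concept0 = weights[len(vocab) * 1 + 0]
severe_for_concept1 = weights[len(vocab) * 2 + 0]
print(shared_severe, severe_for_concept0, severe_for_concept1)
print(predictions)
```

In this layout the shared block lets evidence about a word transfer across concepts, while the per-concept blocks absorb concept-specific deviations; the regularization strength controls how much each concept is allowed to deviate.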
format Online
Article
Text
id pubmed-6515525
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6515525 2019-05-20 Quantifying risk factors in medical reports with a context-aware linear model Przybyła, Piotr Brockmeier, Austin J Ananiadou, Sophia J Am Med Inform Assoc Research and Applications Oxford University Press 2019-03-06 /pmc/articles/PMC6515525/ /pubmed/30840055 http://dx.doi.org/10.1093/jamia/ocz004 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Quantifying risk factors in medical reports with a context-aware linear model
topic Research and Applications