Quantifying risk factors in medical reports with a context-aware linear model
OBJECTIVE: We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests or behaviors) in text is a well-researched task. However, d...
Main authors: | Przybyła, Piotr; Brockmeier, Austin J; Ananiadou, Sophia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2019 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6515525/ https://www.ncbi.nlm.nih.gov/pubmed/30840055 http://dx.doi.org/10.1093/jamia/ocz004 |
_version_ | 1783418102578413568 |
---|---|
author | Przybyła, Piotr Brockmeier, Austin J Ananiadou, Sophia |
author_facet | Przybyła, Piotr Brockmeier, Austin J Ananiadou, Sophia |
author_sort | Przybyła, Piotr |
collection | PubMed |
description | OBJECTIVE: We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests or behaviors) in text is a well-researched task. However, determining the level of risk associated with them is partly dependent on the textual context in which they appear, which may describe severity, temporal aspects, quantity, etc. METHODS: To take into account that a given word appearing in the context of different risk factors (medical concepts) can make different contributions toward risk level, we propose a multitask approach, called context-aware linear modeling, which can be applied using appropriately regularized linear regression. To improve the performance for risk factors unseen in training data (eg, rare diseases), we take into account their distributional similarity to other concepts. RESULTS: The evaluation is based on a corpus of 531 reports from EHRs with 99 376 risk factors rated manually by experts. While context-aware linear modeling significantly outperforms single-task models, taking into account concept similarity further improves performance, reaching the level of human annotators’ agreements. CONCLUSION: Our results show that automatic quantification of risk factors in EHRs can achieve performance comparable to human assessment, and taking into account the multitask structure of the problem and the ability to handle rare concepts is crucial for its accuracy. |
format | Online Article Text |
id | pubmed-6515525 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6515525 2019-05-20 Quantifying risk factors in medical reports with a context-aware linear model Przybyła, Piotr Brockmeier, Austin J Ananiadou, Sophia J Am Med Inform Assoc Research and Applications Oxford University Press 2019-03-06 /pmc/articles/PMC6515525/ /pubmed/30840055 http://dx.doi.org/10.1093/jamia/ocz004 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications Przybyła, Piotr Brockmeier, Austin J Ananiadou, Sophia Quantifying risk factors in medical reports with a context-aware linear model |
title | Quantifying risk factors in medical reports with a context-aware linear model |
title_full | Quantifying risk factors in medical reports with a context-aware linear model |
title_fullStr | Quantifying risk factors in medical reports with a context-aware linear model |
title_full_unstemmed | Quantifying risk factors in medical reports with a context-aware linear model |
title_short | Quantifying risk factors in medical reports with a context-aware linear model |
title_sort | quantifying risk factors in medical reports with a context-aware linear model |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6515525/ https://www.ncbi.nlm.nih.gov/pubmed/30840055 http://dx.doi.org/10.1093/jamia/ocz004 |
work_keys_str_mv | AT przybyłapiotr quantifyingriskfactorsinmedicalreportswithacontextawarelinearmodel AT brockmeieraustinj quantifyingriskfactorsinmedicalreportswithacontextawarelinearmodel AT ananiadousophia quantifyingriskfactorsinmedicalreportswithacontextawarelinearmodel |
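
The abstract describes the method only in outline: a context word may contribute differently to risk depending on the concept it accompanies, and the model is fitted with appropriately regularized linear regression. The following is a minimal sketch of one way such a multitask linear model could be set up, not the authors' implementation. The feature layout (one shared weight per word plus a concept-specific correction), the penalty values, the toy data, and the names `build_features` and `fit_ridge` are all assumptions made for illustration. The concept-similarity extension for unseen risk factors is not covered.

```python
import numpy as np


def build_features(mentions, concepts, vocab):
    """Encode each mention as shared word indicators plus one block per concept.

    Hypothetical layout: columns [0, n_words) hold weights shared by all
    concepts; each concept then owns its own block of n_words columns.
    """
    n_words = len(vocab)
    X = np.zeros((len(mentions), n_words * (1 + len(concepts))))
    for row, (concept, words) in enumerate(mentions):
        offset = n_words * (1 + concepts.index(concept))
        for word in words:
            col = vocab.index(word)
            X[row, col] = 1.0           # shared contribution of the word
            X[row, offset + col] = 1.0  # concept-specific correction
    return X


def fit_ridge(X, y, penalties):
    """Closed-form ridge regression with a separate L2 penalty per feature."""
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)


# Toy data (invented): (concept, context words) pairs with a risk rating each.
concepts = ["pain", "smoking"]
vocab = ["severe", "mild", "former"]
mentions = [
    ("pain", ["severe"]),
    ("pain", ["mild"]),
    ("smoking", ["former"]),
    ("smoking", ["severe"]),
]
y = np.array([3.0, 1.0, 1.0, 2.0])

X = build_features(mentions, concepts, vocab)
n_words = len(vocab)
# Assumed penalties: shared weights are penalized lightly, concept-specific
# corrections more heavily, so tasks share strength unless data disagree.
penalties = np.concatenate(
    [np.full(n_words, 0.1), np.full(n_words * len(concepts), 1.0)]
)
w = fit_ridge(X, y, penalties)
pred = X @ w
print(np.round(pred, 3))
```

In this toy setting the word "severe" receives a large shared weight, while the concept-specific corrections let it count for more next to "pain" than next to "smoking". A single-task model would instead fit each concept in isolation and could not borrow evidence about "severe" across concepts.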