Mining heterogeneous clinical notes by multi-modal latent topic model

Latent knowledge can be extracted from the electronic notes that are recorded during patient encounters with the health system. Using these clinical notes to decipher a patient’s underlying comorbidities, symptom burdens, and treatment courses is an ongoing challenge. Latent topic model as an efficie...

Bibliographic Details
Main authors: Wen, Zhi; Nair, Pratheeksha; Deng, Chih-Ying; Lu, Xing Han; Moseley, Edward; George, Naomi; Lindvall, Charlotta; Li, Yue
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8031429/
https://www.ncbi.nlm.nih.gov/pubmed/33831055
http://dx.doi.org/10.1371/journal.pone.0249622
author Wen, Zhi
Nair, Pratheeksha
Deng, Chih-Ying
Lu, Xing Han
Moseley, Edward
George, Naomi
Lindvall, Charlotta
Li, Yue
author_sort Wen, Zhi
collection PubMed
description Latent knowledge can be extracted from the electronic notes that are recorded during patient encounters with the health system. Using these clinical notes to decipher a patient’s underlying comorbidities, symptom burdens, and treatment courses is an ongoing challenge. Latent topic models, an efficient Bayesian method, can be used to model each patient’s clinical notes as “documents” and the words in the notes as “tokens”. However, standard latent topic models assume that all of the notes follow the same topic distribution, regardless of the type of note or the domain expertise of the author (such as doctors or nurses). We propose a novel application of latent topic modeling, using a multi-note topic model (MNTM) to jointly infer distinct topic distributions for notes of different types. We applied our model to clinical notes from the MIMIC-III dataset to infer distinct topic distributions over the physician and nursing note types. Based on manual assessments made by clinicians, we observed a significant improvement in topic interpretability using MNTM over the baseline single-note topic models that ignore the note types. Moreover, our MNTM model led to significantly higher prediction accuracy for prolonged mechanical ventilation and mortality using only the first 48 hours of patient data. By correlating the patients’ topic mixtures with hospital mortality and prolonged mechanical ventilation, we identified several diagnostic topics that are associated with poor outcomes. Because of its elegant and intuitive formation, we envision a broad application of our approach in mining multi-modality text-based healthcare information that goes beyond clinical notes. Code available at https://github.com/li-lab-mcgill/heterogeneous_ehr.
format Online
Article
Text
id pubmed-8031429
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8031429 2021-04-14 PLoS One Research Article Public Library of Science 2021-04-08 /pmc/articles/PMC8031429/ /pubmed/33831055 http://dx.doi.org/10.1371/journal.pone.0249622 Text en © 2021 Wen et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Mining heterogeneous clinical notes by multi-modal latent topic model
topic Research Article