Topic Modeling for Interpretable Text Classification From EHRs

The clinical notes in electronic health records have many possibilities for predictive tasks in text classification. The interpretability of these classification models for the clinical domain is critical for decision making. Using topic models for text classification of electronic health records fo...

Bibliographic Details
Main Authors:	Rijcken, Emil; Kaymak, Uzay; Scheepers, Floortje; Mosteiro, Pablo; Zervanou, Kalliopi; Spruit, Marco
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Big Data
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9114871/ https://www.ncbi.nlm.nih.gov/pubmed/35600326 http://dx.doi.org/10.3389/fdata.2022.846930

author	Rijcken, Emil; Kaymak, Uzay; Scheepers, Floortje; Mosteiro, Pablo; Zervanou, Kalliopi; Spruit, Marco
author_sort	Rijcken, Emil
collection	PubMed
description	The clinical notes in electronic health records have many possibilities for predictive tasks in text classification. The interpretability of these classification models for the clinical domain is critical for decision making. Using topic models for text classification of electronic health records for a predictive task allows for the use of topics as features, thus making the text classification more interpretable. However, selecting the most effective topic model is not trivial. In this work, we propose considerations for selecting a suitable topic model based on the predictive performance and interpretability measure for text classification. We compare 17 different topic models in terms of both interpretability and predictive performance in an inpatient violence prediction task using clinical notes. We find no correlation between interpretability and predictive performance. In addition, our results show that although no model outperforms the other models on both variables, our proposed fuzzy topic modeling algorithm (FLSA-W) performs best in most settings for interpretability, whereas two state-of-the-art methods (ProdLDA and LSI) achieve the best predictive performance.
format	Online Article Text
id	pubmed-9114871
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9114871 2022-05-19 Topic Modeling for Interpretable Text Classification From EHRs. Rijcken, Emil; Kaymak, Uzay; Scheepers, Floortje; Mosteiro, Pablo; Zervanou, Kalliopi; Spruit, Marco. Front Big Data. Big Data. Frontiers Media S.A. 2022-05-04 /pmc/articles/PMC9114871/ /pubmed/35600326 http://dx.doi.org/10.3389/fdata.2022.846930 Text en Copyright © 2022 Rijcken, Kaymak, Scheepers, Mosteiro, Zervanou and Spruit. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Topic Modeling for Interpretable Text Classification From EHRs
topic	Big Data
