Quality indices for topic model selection and evaluation: a literature review and case study
Main Authors: | Meaney, Christopher; Stukel, Therese A.; Austin, Peter C.; Moineddin, Rahim; Greiver, Michelle; Escobar, Michael |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2023 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10362613/ https://www.ncbi.nlm.nih.gov/pubmed/37481523 http://dx.doi.org/10.1186/s12911-023-02216-1 |
_version_ | 1785076463201943552 |
---|---|
author | Meaney, Christopher; Stukel, Therese A.; Austin, Peter C.; Moineddin, Rahim; Greiver, Michelle; Escobar, Michael
author_facet | Meaney, Christopher; Stukel, Therese A.; Austin, Peter C.; Moineddin, Rahim; Greiver, Michelle; Escobar, Michael
author_sort | Meaney, Christopher |
collection | PubMed |
description | BACKGROUND: Topic models are a class of unsupervised machine learning models, which facilitate summarization, browsing and retrieval from large unstructured document collections. This study reviews several methods for assessing the quality of unsupervised topic models estimated using non-negative matrix factorization. Techniques for topic model validation have been developed across disparate fields. We synthesize this literature, discuss the advantages and disadvantages of different techniques for topic model validation, and illustrate their usefulness for guiding model selection on a large clinical text corpus. DESIGN, SETTING AND DATA: Using a retrospective cohort design, we curated a text corpus containing 382,666 clinical notes collected from 01/01/2017 through 12/31/2020 from primary care electronic medical records in Toronto, Canada. METHODS: Several topic model quality metrics have been proposed to assess different aspects of model fit. We explored the following metrics: reconstruction error, topic coherence, rank biased overlap, Kendall’s weighted tau, partition coefficient, partition entropy and the Xie-Beni statistic. Depending on context, cross-validation and/or bootstrap stability analysis were used to estimate these metrics on our corpus. RESULTS: Cross-validated reconstruction error favored large topic models (K ≥ 100 topics) on our corpus. Stability analysis using topic coherence and the Xie-Beni statistic also favored large models (K = 100 topics). Rank biased overlap and Kendall’s weighted tau favored small models (K = 5 topics). Few model evaluation metrics identified mid-sized topic models (25 ≤ K ≤ 75) as optimal. However, human judgment suggested that mid-sized topic models produced expressive low-dimensional summarizations of the corpus. CONCLUSIONS: Topic model quality indices are transparent quantitative tools for guiding model selection and evaluation. Our empirical illustration demonstrated that different topic model quality indices favor models of different complexity, and may not select models that align with human judgment. This suggests that different metrics capture different aspects of model goodness of fit. A combination of topic model quality indices, coupled with human validation, may be useful in appraising unsupervised topic models. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-023-02216-1. |
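The abstract names a full suite of metrics; as a concrete illustration of the simplest one, the sketch below shows cross-validated reconstruction error for NMF topic models over a grid of K, in a plain scikit-learn workflow. This is a minimal sketch under stated assumptions, not the authors' pipeline: the toy corpus, vectorizer settings, and grid of K are placeholders (the study used 382,666 primary care notes and compared models from K = 5 up to K ≥ 100).

```python
# Minimal sketch (not the authors' released code): fit NMF topic models over
# a grid of K, score held-out reconstruction error, and print top words per
# topic for human review. Corpus and settings are illustrative assumptions.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

docs = [
    "chest pain shortness of breath ecg ordered",
    "type two diabetes insulin glucose poorly controlled",
    "knee pain xray shows osteoarthritis orthopedic referral",
    "cough fever likely viral infection rest and fluids",
] * 50  # stand-in for the clinical note corpus

vec = TfidfVectorizer()
X = vec.fit_transform(docs)
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)
terms = vec.get_feature_names_out()

for k in (2, 3, 4):  # the study scanned a much wider grid of K
    nmf = NMF(n_components=k, init="nndsvd", max_iter=500, random_state=0)
    nmf.fit(X_train)
    H = nmf.components_                # K x V topic-by-term matrix
    W_test = nmf.transform(X_test)     # project held-out documents onto topics
    # Frobenius norm of the held-out residual = reconstruction error
    err = np.linalg.norm(X_test.toarray() - W_test @ H)
    top = [", ".join(terms[np.argsort(h)[::-1][:3]]) for h in H]
    print(f"K={k}  held-out error={err:.3f}  topics: {top}")
```

The stability metrics the abstract lists (topic coherence, rank biased overlap, Kendall’s weighted tau, Xie-Beni) would be computed analogously: refit the model on bootstrap resamples of the corpus and compare the resulting topic-term rankings or document partitions across fits, rather than scoring a single held-out residual.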
format | Online Article Text |
id | pubmed-10362613 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-103626132023-07-23 Quality indices for topic model selection and evaluation: a literature review and case study Meaney, Christopher; Stukel, Therese A.; Austin, Peter C.; Moineddin, Rahim; Greiver, Michelle; Escobar, Michael BMC Med Inform Decis Mak Research Article BioMed Central 2023-07-22 /pmc/articles/PMC10362613/ /pubmed/37481523 http://dx.doi.org/10.1186/s12911-023-02216-1 Text en © The Author(s) 2023. Open Access under the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Meaney, Christopher Stukel, Therese A. Austin, Peter C. Moineddin, Rahim Greiver, Michelle Escobar, Michael Quality indices for topic model selection and evaluation: a literature review and case study |
title | Quality indices for topic model selection and evaluation: a literature review and case study |
title_full | Quality indices for topic model selection and evaluation: a literature review and case study |
title_fullStr | Quality indices for topic model selection and evaluation: a literature review and case study |
title_full_unstemmed | Quality indices for topic model selection and evaluation: a literature review and case study |
title_short | Quality indices for topic model selection and evaluation: a literature review and case study |
title_sort | quality indices for topic model selection and evaluation: a literature review and case study |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10362613/ https://www.ncbi.nlm.nih.gov/pubmed/37481523 http://dx.doi.org/10.1186/s12911-023-02216-1 |
work_keys_str_mv | AT meaneychristopher qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy AT stukeltheresea qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy AT austinpeterc qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy AT moineddinrahim qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy AT greivermichelle qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy AT escobarmichael qualityindicesfortopicmodelselectionandevaluationaliteraturereviewandcasestudy |