The impact of ordinal scales on Gaussian mixture recovery
Gaussian mixture models (GMMs) are a popular and versatile tool for exploring heterogeneity in multivariate continuous data. Arguably the most popular way to estimate GMMs is via the expectation–maximization (EM) algorithm combined with model selection using the Bayesian information criterion (BIC)....
Main authors: | Haslbeck, Jonas M. B.; Vermunt, Jeroen K.; Waldorp, Lourens J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10250525/ https://www.ncbi.nlm.nih.gov/pubmed/35831565 http://dx.doi.org/10.3758/s13428-022-01883-8 |
_version_ | 1785055770922975232 |
---|---|
author | Haslbeck, Jonas M. B. Vermunt, Jeroen K. Waldorp, Lourens J. |
collection | PubMed |
description | Gaussian mixture models (GMMs) are a popular and versatile tool for exploring heterogeneity in multivariate continuous data. Arguably the most popular way to estimate GMMs is via the expectation–maximization (EM) algorithm combined with model selection using the Bayesian information criterion (BIC). If the GMM is correctly specified, this estimation procedure has been demonstrated to have high recovery performance. However, in many situations, the data are not continuous but ordinal, for example when assessing symptom severity in medical data or modeling the responses in a survey. For such situations, it is unknown how well the EM algorithm and the BIC perform in GMM recovery. In the present paper, we investigate this question by simulating data from various GMMs, thresholding them into ordinal categories and evaluating recovery performance. We show that the number of components can be estimated reliably if the number of ordinal categories and the number of variables are high enough. However, the estimates of the parameters of the component models are biased independent of sample size. Finally, we discuss alternative modeling approaches which might be adopted for the situations in which estimating a GMM is not acceptable. |
format | Online Article Text |
id | pubmed-10250525 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-10250525 2023-06-10 The impact of ordinal scales on Gaussian mixture recovery Haslbeck, Jonas M. B. Vermunt, Jeroen K. Waldorp, Lourens J. Behav Res Methods Article Springer US 2022-07-13 2023 /pmc/articles/PMC10250525/ /pubmed/35831565 http://dx.doi.org/10.3758/s13428-022-01883-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | The impact of ordinal scales on Gaussian mixture recovery |
topic | Article |
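
The procedure summarized in the description (simulate data from a GMM, threshold them into ordinal categories, fit GMMs with EM, and choose the number of components by BIC) can be illustrated with a minimal univariate sketch. This is not the authors' code: the article studies multivariate GMMs, and the equal-width thresholds, the quantile-based initialization, the standard-deviation floor, and every function name below are assumptions made here for illustration only.

```python
import math
import random


def simulate_mixture(n, weights, means, sds, rng):
    """Draw n values from a univariate Gaussian mixture."""
    comps = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[j], sds[j]) for j in comps]


def to_ordinal(x, n_categories):
    """Cut continuous values into equal-width categories 0..n_categories-1 (assumed scheme)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_categories
    if width == 0:
        return [0] * len(x)
    return [min(int((v - lo) / width), n_categories - 1) for v in x]


def component_densities(v, weights, means, sds):
    """Weighted Gaussian density of each component at v."""
    return [
        w * math.exp(-0.5 * ((v - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    ]


def log_likelihood(x, weights, means, sds):
    # The 1e-300 floor guards against log(0) if every density underflows.
    return sum(
        math.log(max(sum(component_densities(v, weights, means, sds)), 1e-300))
        for v in x
    )


def fit_gmm_em(x, k, n_iter=100, min_sd=0.1):
    """Fit a k-component univariate GMM by EM; returns (loglik, weights, means, sds)."""
    n = len(x)
    xs = sorted(x)
    # Initialize means at evenly spaced quantiles (an assumption, not the paper's choice).
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(x) / n
    sd0 = max(math.sqrt(sum((v - grand_mean) ** 2 for v in x) / n), min_sd)
    sds = [sd0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for v in x:
            dens = component_densities(v, weights, means, sds)
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            # Floor the sd so a component cannot collapse onto a single category.
            sds[j] = max(math.sqrt(var), min_sd)
    return log_likelihood(x, weights, means, sds), weights, means, sds


def select_k_by_bic(x, candidates=(1, 2, 3)):
    """Return the candidate k with the lowest BIC, plus all BIC values."""
    bics = {}
    for k in candidates:
        loglik = fit_gmm_em(x, k)[0]
        n_params = 3 * k - 1  # k means, k sds, k - 1 free weights
        bics[k] = -2.0 * loglik + n_params * math.log(len(x))
    return min(bics, key=bics.get), bics


# Two well-separated latent components, observed on coarser or finer ordinal scales.
rng = random.Random(1)
latent = simulate_mixture(400, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0], rng)
for n_categories in (3, 7):
    ordinal = [float(v) for v in to_ordinal(latent, n_categories)]
    k_hat, _ = select_k_by_bic(ordinal)
    print(f"{n_categories} categories: BIC selects k = {k_hat}")
```

Comparing the selected number of components and the fitted means and standard deviations against the values used in `simulate_mixture` gives a rough sense of the recovery question the article investigates. A single seed in one dimension is not a substitute for the simulation study itself.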