
A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB

BACKGROUND: There are various methodological approaches to identifying clinically important subgroups and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). There is a...


Bibliographic Details
Main authors: Kent, Peter, Jensen, Rikke K, Kongsted, Alice
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4192340/
https://www.ncbi.nlm.nih.gov/pubmed/25272975
http://dx.doi.org/10.1186/1471-2288-14-113
collection PubMed
description BACKGROUND: There are various methodological approaches to identifying clinically important subgroups and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). There is a scarcity of head-to-head comparisons that can inform the choice of which clustering method might be suitable for particular clinical datasets and research questions. Therefore, the aim of this study was to perform a head-to-head comparison of three commonly available methods (SPSS TwoStep CA, Latent Gold LCA and SNOB LCA).
METHODS: The performance of these three methods was compared: (i) quantitatively using the number of subgroups detected, the classification probability of individuals into subgroups, the reproducibility of results, and (ii) qualitatively using subjective judgments about each program’s ease of use and interpretability of the presentation of results. We analysed five real datasets of varying complexity in a secondary analysis of data from other research projects. Three datasets contained only MRI findings (n = 2,060 to 20,810 vertebral disc levels), one dataset contained only pain intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed testing the ability of these clustering methods to detect subgroups and correctly classify individuals when subgroup membership was known.
RESULTS: The results from the real clinical datasets indicated that the number of subgroups detected varied, the certainty of classifying individuals into those subgroups varied, the findings had perfect reproducibility, some programs were easier to use and the interpretability of the presentation of their findings also varied. The results from the artificial datasets indicated that all three clustering methods showed a near-perfect ability to detect known subgroups and correctly classify individuals into those subgroups.
CONCLUSIONS: Our subjective judgement was that Latent Gold offered the best balance of sensitivity to subgroups, ease of use and presentation of results with these datasets but we recognise that different clustering methods may suit other types of data and clinical research questions.
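The artificial-dataset check described in the abstract (detect known subgroups, then see how many individuals are classified correctly and with what certainty) can be illustrated with a small sketch. This is not SPSS TwoStep, Latent Gold or SNOB: it swaps in a generic two-class Gaussian mixture fitted by EM on a single variable, and the group centres (0 and 6), spread, sample size and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM (hypothetical helper).

    Returns (weights, means, sds, posteriors); posteriors[i][k] is the
    estimated probability that observation i belongs to class k.
    """
    ordered = sorted(values)
    n = len(values)
    # Crude start: lower and upper quartiles as the two class means,
    # so class 0 is always the lower-valued class.
    means = [ordered[n // 4], ordered[3 * n // 4]]
    grand_mean = sum(values) / n
    grand_sd = math.sqrt(sum((v - grand_mean) ** 2 for v in values) / n)
    sds = [grand_sd, grand_sd]
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(n_iter):
        # E-step: class membership probabilities under current parameters.
        posteriors = []
        for v in values:
            joint = [weights[k] * normal_pdf(v, means[k], sds[k]) for k in range(2)]
            total = sum(joint)
            posteriors.append([j / total for j in joint])
        # M-step: re-estimate parameters from the weighted observations.
        for k in range(2):
            resp = sum(p[k] for p in posteriors)
            weights[k] = resp / n
            means[k] = sum(p[k] * v for p, v in zip(posteriors, values)) / resp
            var = sum(p[k] * (v - means[k]) ** 2 for p, v in zip(posteriors, values)) / resp
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds, posteriors


def run_demo(seed=0, n_per_group=500):
    """Simulate two known subgroups, fit the mixture, and score the result."""
    rng = random.Random(seed)
    values, truth = [], []
    for label, centre in enumerate((0.0, 6.0)):  # assumed subgroup centres
        for _ in range(n_per_group):
            values.append(rng.gauss(centre, 1.0))
            truth.append(label)
    weights, means, sds, posteriors = fit_two_class_mixture(values)
    # Assign each individual to the class with the higher posterior probability.
    assigned = [0 if p[0] >= p[1] else 1 for p in posteriors]
    accuracy = sum(a == t for a, t in zip(assigned, truth)) / len(truth)
    # Average of each individual's highest class probability ("certainty").
    mean_certainty = sum(max(p) for p in posteriors) / len(posteriors)
    return {"means": means, "accuracy": accuracy, "mean_certainty": mean_certainty}


result = run_demo()
print(result)
```

With subgroups this well separated, the proportion correctly classified and the mean classification probability should both be close to 1, which mirrors the kind of quantitative comparison the abstract reports; real clinical data with overlapping subgroups would be expected to give lower certainty.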
format Online
Article
Text
id pubmed-4192340
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4192340 2014-10-11 A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB Kent, Peter Jensen, Rikke K Kongsted, Alice BMC Med Res Methodol Research Article BioMed Central 2014-10-02 /pmc/articles/PMC4192340/ /pubmed/25272975 http://dx.doi.org/10.1186/1471-2288-14-113 Text en © Kent et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB
topic Research Article