Clusters of people with type 2 diabetes in the general population: unsupervised machine learning approach using national surveys in Latin America and the Caribbean
INTRODUCTION: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). RESEARCH DESIGN AND METHODS: We analyzed 13 population-based national su...
Main authors: | Carrillo-Larco, Rodrigo M; Castillo-Cara, Manuel; Anza-Ramirez, Cecilia; Bernabé-Ortiz, Antonio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2021 |
Subjects: | Epidemiology/Health services research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849890/ https://www.ncbi.nlm.nih.gov/pubmed/33514531 http://dx.doi.org/10.1136/bmjdrc-2020-001889 |
_version_ | 1783645374878056448 |
---|---|
author | Carrillo-Larco, Rodrigo M Castillo-Cara, Manuel Anza-Ramirez, Cecilia Bernabé-Ortiz, Antonio |
author_sort | Carrillo-Larco, Rodrigo M |
collection | PubMed |
description | INTRODUCTION: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). RESEARCH DESIGN AND METHODS: We analyzed 13 population-based national surveys in nine countries (n=8361). We used k-means to develop a clustering model; predictors were age, sex, body mass index (BMI), waist circumference (WC), systolic/diastolic blood pressure (SBP/DBP), and T2DM family history. The training data set included all surveys, and the clusters were then predicted in each country-year data set. We used Euclidean distance, elbow and silhouette plots to select the optimal number of clusters and described each cluster according to the underlying predictors (mean and proportions). RESULTS: The optimal number of clusters was 4. Cluster 0 grouped more men and those with the highest mean SBP/DBP. Cluster 1 had the highest mean BMI and WC, as well as the largest proportion of T2DM family history. We observed the smallest values of all predictors in cluster 2. Cluster 3 had the highest mean age. When we reflected the four clusters in each country-year data set, a different distribution was observed. For example, cluster 3 was the most frequent in the training data set, and so it was in 7 out of 13 other country-year data sets. CONCLUSIONS: Using unsupervised machine learning algorithms, it was possible to cluster people with T2DM from the general population in LAC; clusters showed unique profiles that could be used to identify the underlying characteristics of the T2DM population in LAC. |
format | Online Article Text |
id | pubmed-7849890 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-78498902021-02-02 Clusters of people with type 2 diabetes in the general population: unsupervised machine learning approach using national surveys in Latin America and the Caribbean Carrillo-Larco, Rodrigo M Castillo-Cara, Manuel Anza-Ramirez, Cecilia Bernabé-Ortiz, Antonio BMJ Open Diabetes Res Care Epidemiology/Health services research INTRODUCTION: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). RESEARCH DESIGN AND METHODS: We analyzed 13 population-based national surveys in nine countries (n=8361). We used k-means to develop a clustering model; predictors were age, sex, body mass index (BMI), waist circumference (WC), systolic/diastolic blood pressure (SBP/DBP), and T2DM family history. The training data set included all surveys, and the clusters were then predicted in each country-year data set. We used Euclidean distance, elbow and silhouette plots to select the optimal number of clusters and described each cluster according to the underlying predictors (mean and proportions). RESULTS: The optimal number of clusters was 4. Cluster 0 grouped more men and those with the highest mean SBP/DBP. Cluster 1 had the highest mean BMI and WC, as well as the largest proportion of T2DM family history. We observed the smallest values of all predictors in cluster 2. Cluster 3 had the highest mean age. When we reflected the four clusters in each country-year data set, a different distribution was observed. For example, cluster 3 was the most frequent in the training data set, and so it was in 7 out of 13 other country-year data sets. CONCLUSIONS: Using unsupervised machine learning algorithms, it was possible to cluster people with T2DM from the general population in LAC; clusters showed unique profiles that could be used to identify the underlying characteristics of the T2DM population in LAC. 
BMJ Publishing Group 2021-01-29 /pmc/articles/PMC7849890/ /pubmed/33514531 http://dx.doi.org/10.1136/bmjdrc-2020-001889 Text en © Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/. |
title | Clusters of people with type 2 diabetes in the general population: unsupervised machine learning approach using national surveys in Latin America and the Caribbean |
topic | Epidemiology/Health services research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849890/ https://www.ncbi.nlm.nih.gov/pubmed/33514531 http://dx.doi.org/10.1136/bmjdrc-2020-001889 |
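The abstract in this record describes fitting k-means on pooled data, choosing the number of clusters with Euclidean distance, elbow and silhouette plots, and then predicting the clusters in each country-year data set. The following is a minimal sketch of that general workflow, not the authors' code. Everything in it is an assumption made for illustration: the synthetic two-dimensional data (not survey data), the z-score standardization, the farthest-first initialization, and the helper names `standardize`, `assign`, `kmeans` and `mean_silhouette`.

```python
import math
import random

def standardize(rows):
    """Z-score each column so no predictor dominates the Euclidean distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, sds)) for r in rows]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    # Farthest-first initialization (a simplification chosen for this sketch).
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Move the centroid to the mean of its members; keep it if the cluster is empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    # Inertia (within-cluster sum of squares) is the quantity drawn in an elbow plot.
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia

def mean_silhouette(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic stand-in data: four groups in two made-up dimensions (NOT survey data).
rng = random.Random(42)
centers = [(0, 0), (6, 0), (0, 6), (6, 6)]
raw = [(rng.gauss(cx, 0.4), rng.gauss(cy, 0.4)) for cx, cy in centers for _ in range(30)]
points = standardize(raw)

# Fit on the pooled data for several candidate k and record both criteria.
results = {}
for k in range(2, 7):
    labels, centroids, inertia = kmeans(points, k)
    results[k] = (inertia, mean_silhouette(points, labels))
    print(f"k={k}  inertia={inertia:.2f}  silhouette={results[k][1]:.3f}")

best_k = max(results, key=lambda k: results[k][1])
print("k with the highest mean silhouette:", best_k)

# Apply the pooled model to a subset by assigning its rows to the fitted centroids.
labels, centroids, _ = kmeans(points, best_k)
subset = points[:40]  # hypothetical stand-in for one country-year data set
subset_labels = assign(subset, centroids)
shares = {c: subset_labels.count(c) / len(subset) for c in range(best_k)}
print("cluster shares in the subset:", shares)
```

Because the subset is labelled with centroids fitted on the pooled data, cluster definitions stay fixed while their frequencies are free to differ between subsets, which is the kind of comparison the abstract reports across country-year data sets. With real survey predictors, mixed variable types such as sex or family history would need an encoding decision that this sketch does not address.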