Clusters of people with type 2 diabetes in the general population: unsupervised machine learning approach using national surveys in Latin America and the Caribbean

INTRODUCTION: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). RESEARCH DESIGN AND METHODS: We analyzed 13 population-based national su...

Bibliographic Details
Main authors: Carrillo-Larco, Rodrigo M, Castillo-Cara, Manuel, Anza-Ramirez, Cecilia, Bernabé-Ortiz, Antonio
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2021
Subjects: Epidemiology/Health services research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849890/
https://www.ncbi.nlm.nih.gov/pubmed/33514531
http://dx.doi.org/10.1136/bmjdrc-2020-001889
_version_ 1783645374878056448
author Carrillo-Larco, Rodrigo M
Castillo-Cara, Manuel
Anza-Ramirez, Cecilia
Bernabé-Ortiz, Antonio
author_sort Carrillo-Larco, Rodrigo M
collection PubMed
description INTRODUCTION: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). RESEARCH DESIGN AND METHODS: We analyzed 13 population-based national surveys in nine countries (n=8361). We used k-means to develop a clustering model; predictors were age, sex, body mass index (BMI), waist circumference (WC), systolic/diastolic blood pressure (SBP/DBP), and T2DM family history. The training data set included all surveys, and the clusters were then predicted in each country-year data set. We used Euclidean distance, elbow and silhouette plots to select the optimal number of clusters and described each cluster according to the underlying predictors (means and proportions). RESULTS: The optimal number of clusters was 4. Cluster 0 grouped more men and those with the highest mean SBP/DBP. Cluster 1 had the highest mean BMI and WC, as well as the largest proportion of T2DM family history. We observed the smallest values of all predictors in cluster 2. Cluster 3 had the highest mean age. When we applied the four clusters to each country-year data set, the distribution of clusters differed. For example, cluster 3 was the most frequent in the training data set, and it was also the most frequent in 7 of the 13 country-year data sets. CONCLUSIONS: Using unsupervised machine learning algorithms, it was possible to cluster people with T2DM from the general population in LAC; clusters showed unique profiles that could be used to identify the underlying characteristics of the T2DM population in LAC.
format Online
Article
Text
id pubmed-7849890
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-7849890 2021-02-02 BMJ Open Diabetes Res Care Epidemiology/Health services research
BMJ Publishing Group 2021-01-29 /pmc/articles/PMC7849890/ /pubmed/33514531 http://dx.doi.org/10.1136/bmjdrc-2020-001889 Text en © Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
title Clusters of people with type 2 diabetes in the general population: unsupervised machine learning approach using national surveys in Latin America and the Caribbean
topic Epidemiology/Health services research
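The workflow summarized in the description (fit k-means on the pooled surveys, inspect an elbow curve, then assign each survey's participants to the fitted clusters) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the data are synthetic, the survey names `survey_A` and `survey_B` are hypothetical, only three continuous predictors are used, the predictors are z-scored before clustering (the record does not say whether the study did this), and the silhouette step is omitted. The value k=4 is taken from the reported result.

```python
import math
import random


def standardize(rows):
    """Z-score each column; also return the (mean, sd) pairs for reuse on new data."""
    stats = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        stats.append((m, sd))
    return [[(v - m) / sd for v, (m, sd) in zip(r, stats)] for r in rows], stats


def sq_dist(a, b):
    """Squared Euclidean distance (gives the same nearest centroid as Euclidean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(rows, centroids):
    """Assign each row to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(r, centroids[j]))
            for r in rows]


def kmeans(rows, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm; returns centroids, labels, and inertia."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(max_iter):
        labels = predict(rows, centroids)
        new = []
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            # Keep the old centroid if a cluster happens to be empty.
            new.append([sum(c) / len(members) for c in zip(*members)]
                       if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    labels = predict(rows, centroids)
    inertia = sum(sq_dist(r, centroids[l]) for r, l in zip(rows, labels))
    return centroids, labels, inertia


# Synthetic stand-in data: columns are (age, BMI, SBP). All values are invented.
rng = random.Random(42)
profiles = {"survey_A": (50, 27, 125), "survey_B": (62, 31, 140)}
surveys = {
    name: [[rng.gauss(age, 8), rng.gauss(bmi, 4), rng.gauss(sbp, 15)]
           for _ in range(150)]
    for name, (age, bmi, sbp) in profiles.items()
}

# Training set: all surveys pooled together.
pooled = [row for rows in surveys.values() for row in rows]
z_pooled, stats = standardize(pooled)

# Elbow inspection: within-cluster sum of squares for a range of k.
for k in range(2, 7):
    print("k =", k, "inertia =", round(kmeans(z_pooled, k)[2], 1))

# Fit the final model with k=4, then predict clusters within each survey.
centroids, labels, inertia = kmeans(z_pooled, 4)
shares = {}
for name, rows in surveys.items():
    z = [[(v - m) / sd for v, (m, sd) in zip(r, stats)] for r in rows]
    assigned = predict(z, centroids)
    shares[name] = [assigned.count(j) / len(assigned) for j in range(4)]
    print(name, [round(s, 2) for s in shares[name]])
```

Reusing the pooled means and standard deviations when scoring each survey keeps every participant on the same scale as the centroids, which is what allows cluster frequencies to be compared across surveys.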