Risk prediction of chronic diseases with a two-stage semi-supervised clustering method

Early detection of chronic diseases such as cardiovascular disease (CVD) and diabetes can make the difference between life and death. Previous studies have demonstrated the feasibility of disease diagnosis and prediction using machine learning and disease-indicating biomarkers. The aim of this study...

Bibliographic Details
Main Authors: Mao, Zaixing; Fukuma, Yasufumi; Tsukada, Hisashi; Wada, Satoshi
Format: Online Article Text
Language: English
Published: 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9929447/
https://www.ncbi.nlm.nih.gov/pubmed/36816765
http://dx.doi.org/10.1016/j.pmedr.2023.102129
author Mao, Zaixing
Fukuma, Yasufumi
Tsukada, Hisashi
Wada, Satoshi
collection PubMed
description Early detection of chronic diseases such as cardiovascular disease (CVD) and diabetes can make the difference between life and death. Previous studies have demonstrated the feasibility of disease diagnosis and prediction using machine learning and disease-indicating biomarkers. The aim of this study is to develop a method to detect the risk of future disease even when disease-indicating biomarker readings are in the normal range. Data from the US Centers for Disease Control and Prevention (CDC) National Health and Nutrition Examination Surveys (NHANES) are used for this study. A two-stage semi-supervised K-Means (SSK-Means) clustering approach was developed to identify the underlying risk of each individual and categorize them into high or low-risk groups for CVD and diabetes. Our developed method of classification can identify groups as high risk or low risk, even if they would have been considered normal using traditional biomarker threshold criteria. For CVD, the SSK-Means clustering results showed that individuals over 30 years of age in the high-risk group were almost twice as likely to develop CVD as individuals in the low-risk group. For diabetes, the SSK-Means clustering results showed that individuals over 50 years in the high-risk group have at least two times the risk of developing diabetes compared with individuals in the low-risk group.
format Online
Article
Text
id pubmed-9929447
institution National Center for Biotechnology Information
language English
publishDate 2023
record_format MEDLINE/PubMed
spelling pubmed-9929447 2023-02-16 Risk prediction of chronic diseases with a two-stage semi-supervised clustering method Mao, Zaixing Fukuma, Yasufumi Tsukada, Hisashi Wada, Satoshi Prev Med Rep Regular Article 2023-02-06 /pmc/articles/PMC9929447/ /pubmed/36816765 http://dx.doi.org/10.1016/j.pmedr.2023.102129 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
topic Regular Article
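
The abstract in this record describes grouping individuals into high-risk and low-risk clusters with a semi-supervised K-Means method and then comparing disease rates between the groups. The record does not say what the two stages are, which features are used, or how labels enter the clustering, so the following is only a minimal sketch under stated assumptions and not the authors' SSK-Means. It uses seeded K-Means as a stand-in (a few labeled points fix the initial centroids and keep their labels), runs on synthetic two-dimensional data, and computes a risk ratio from hypothetical counts. The names `seeded_kmeans`, `mean_point`, and `relative_risk` are illustrative.

```python
import math
import random


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def seeded_kmeans(X, seed_labels, n_clusters=2, n_iter=50):
    """Seeded K-Means (assumed stand-in, not the paper's method).

    seed_labels[i] is a cluster index for a labeled point, or -1 if unlabeled.
    Labeled points initialise the centroids and never change cluster.
    """
    for k in range(n_clusters):
        if k not in seed_labels:
            raise ValueError(f"cluster {k} needs at least one labeled seed")
    centroids = [
        mean_point([x for x, s in zip(X, seed_labels) if s == k])
        for k in range(n_clusters)
    ]
    assign = list(seed_labels)
    for _ in range(n_iter):
        new_assign = []
        for x, s in zip(X, seed_labels):
            if s >= 0:
                new_assign.append(s)  # keep the known label
            else:
                # assign to the nearest centroid (Euclidean distance)
                new_assign.append(
                    min(range(n_clusters), key=lambda k: math.dist(x, centroids[k]))
                )
        if new_assign == assign:
            break  # converged: no assignment changed
        assign = new_assign
        centroids = [
            mean_point([x for x, a in zip(X, assign) if a == k])
            for k in range(n_clusters)
        ]
    return assign, centroids


def relative_risk(events_high, n_high, events_low, n_low):
    """Ratio of disease incidence in the high-risk vs. low-risk group."""
    return (events_high / n_high) / (events_low / n_low)


# Synthetic data (not NHANES): two well-separated groups of 50 points each.
rng = random.Random(0)
low = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(50)]
high = [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(50)]
X = low + high

# Label only the first five points of each group; -1 marks unlabeled points.
seed_labels = [-1] * len(X)
for i in range(5):
    seed_labels[i] = 0        # 0 = low risk
    seed_labels[50 + i] = 1   # 1 = high risk

assign, centroids = seeded_kmeans(X, seed_labels)

# Hypothetical counts: 20 of 100 high-risk vs. 10 of 100 low-risk develop disease.
rr = relative_risk(20, 100, 10, 100)
print(sum(assign), rr)
```

A risk ratio near 2 in such a comparison corresponds to the abstract's phrasing that the high-risk group is about twice as likely to develop the disease as the low-risk group.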