Predicting Depression in Community Dwellers Using a Machine Learning Algorithm
Depression is one of the leading causes of disability worldwide. Given the socioeconomic burden of depression, appropriate depression screening for community dwellers is necessary. We used data from the 2014 and 2016 Korea National Health and Nutrition Examination Surveys. The 2014 dataset was used...
Main authors: | Cho, Seo-Eun; Geem, Zong Woo; Na, Kyoung-Sae |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394838/ https://www.ncbi.nlm.nih.gov/pubmed/34441363 http://dx.doi.org/10.3390/diagnostics11081429 |
_version_ | 1783744037681889280 |
---|---|
author | Cho, Seo-Eun Geem, Zong Woo Na, Kyoung-Sae |
author_facet | Cho, Seo-Eun Geem, Zong Woo Na, Kyoung-Sae |
author_sort | Cho, Seo-Eun |
collection | PubMed |
description | Depression is one of the leading causes of disability worldwide. Given the socioeconomic burden of depression, appropriate depression screening for community dwellers is necessary. We used data from the 2014 and 2016 Korea National Health and Nutrition Examination Surveys. The 2014 dataset was used as a training set, whereas the 2016 dataset was used as the hold-out test set. The synthetic minority oversampling technique (SMOTE) was used to control for class imbalances between the depression and non-depression groups in the 2014 dataset. The least absolute shrinkage and selection operator (LASSO) was used for feature reduction and classifiers in the final model. Data obtained from 9488 participants were used for the machine learning process. The depression group had poorer socioeconomic, health, functional, and biological measures than the non-depression group. From the initial 37 variables, 13 were selected using LASSO. All performance measures were calculated based on the raw 2016 dataset without the SMOTE. The area under the receiver operating characteristic curve and overall accuracy in the hold-out test set were 0.903 and 0.828, respectively. Perceived stress had the strongest influence on the classifying model for depression. LASSO can be practically applied for depression screening of community dwellers with a few variables. Future studies are needed to develop a more efficient and accurate classification model for depression. |
format | Online Article Text |
id | pubmed-8394838 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8394838 2021-08-28 Predicting Depression in Community Dwellers Using a Machine Learning Algorithm Cho, Seo-Eun Geem, Zong Woo Na, Kyoung-Sae Diagnostics (Basel) Article Depression is one of the leading causes of disability worldwide. Given the socioeconomic burden of depression, appropriate depression screening for community dwellers is necessary. We used data from the 2014 and 2016 Korea National Health and Nutrition Examination Surveys. The 2014 dataset was used as a training set, whereas the 2016 dataset was used as the hold-out test set. The synthetic minority oversampling technique (SMOTE) was used to control for class imbalances between the depression and non-depression groups in the 2014 dataset. The least absolute shrinkage and selection operator (LASSO) was used for feature reduction and classifiers in the final model. Data obtained from 9488 participants were used for the machine learning process. The depression group had poorer socioeconomic, health, functional, and biological measures than the non-depression group. From the initial 37 variables, 13 were selected using LASSO. All performance measures were calculated based on the raw 2016 dataset without the SMOTE. The area under the receiver operating characteristic curve and overall accuracy in the hold-out test set were 0.903 and 0.828, respectively. Perceived stress had the strongest influence on the classifying model for depression. LASSO can be practically applied for depression screening of community dwellers with a few variables. Future studies are needed to develop a more efficient and accurate classification model for depression. MDPI 2021-08-07 /pmc/articles/PMC8394838/ /pubmed/34441363 http://dx.doi.org/10.3390/diagnostics11081429 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Cho, Seo-Eun Geem, Zong Woo Na, Kyoung-Sae Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title | Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title_full | Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title_fullStr | Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title_full_unstemmed | Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title_short | Predicting Depression in Community Dwellers Using a Machine Learning Algorithm |
title_sort | predicting depression in community dwellers using a machine learning algorithm |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394838/ https://www.ncbi.nlm.nih.gov/pubmed/34441363 http://dx.doi.org/10.3390/diagnostics11081429 |
work_keys_str_mv | AT choseoeun predictingdepressionincommunitydwellersusingamachinelearningalgorithm AT geemzongwoo predictingdepressionincommunitydwellersusingamachinelearningalgorithm AT nakyoungsae predictingdepressionincommunitydwellersusingamachinelearningalgorithm |
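
The workflow summarized in the description field (oversample the minority class in the training year only, fit an L1-penalized classifier, then score the untouched hold-out year) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the two features (`stress`, `noise`), the helper names, and the hyperparameters are hypothetical, the oversampler is a simplified SMOTE-style interpolation, and the LASSO classifier is approximated by L1-penalized logistic regression fitted with proximal gradient descent.

```python
import math
import random

def make_survey(n, rng, pos_rate=0.1):
    """Synthetic stand-in for one survey year (hypothetical data)."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < pos_rate else 0
        stress = rng.gauss(1.5 if label else 0.0, 1.0)  # informative feature
        noise = rng.gauss(0.0, 1.0)                     # uninformative feature
        X.append([stress, noise])
        y.append(label)
    return X, y

def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE: interpolate between a random minority row and
    one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (v - b) for b, v in zip(base, nb)])
    return synthetic

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_l1_logistic(X, y, lam=0.05, lr=0.1, epochs=300):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(42)
X_train, y_train = make_survey(400, rng)  # plays the role of the training year
X_test, y_test = make_survey(400, rng)    # plays the role of the hold-out year

# Oversample the minority class in the training set only.
minority = [x for x, l in zip(X_train, y_train) if l == 1]
n_new = (len(y_train) - len(minority)) - len(minority)
X_bal = X_train + smote(minority, n_new, rng=rng)
y_bal = y_train + [1] * n_new

w, b = fit_l1_logistic(X_bal, y_bal)
selected = [j for j, wj in enumerate(w) if wj != 0.0]  # features the L1 penalty kept

# Evaluate on the raw, unresampled hold-out set.
scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X_test]
test_auc = auc(scores, y_test)
test_acc = sum((s >= 0.5) == (l == 1) for s, l in zip(scores, y_test)) / len(y_test)
print(selected, round(test_auc, 3), round(test_acc, 3))
```

Keeping the resampling step away from the hold-out set mirrors the record's statement that performance was computed on the raw 2016 data without SMOTE; the L1 penalty may shrink uninformative coefficients to exactly zero, which is how a fitted model of this kind doubles as a feature selector.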