
Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study

The purpose of this study was to verify the usefulness of machine learning (ML) for selection of risk factors and development of predictive models for patients with sarcopenia. We collected medical records from Korean postmenopausal women based on Korea National Health and Nutrition Examination Surv...


Bibliographic Details
Main authors: Kang, Yang-Jae, Yoo, Jun-Il, Ha, Yong-chan
Format: Online Article Text
Language: English
Published: Wolters Kluwer Health 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824801/
https://www.ncbi.nlm.nih.gov/pubmed/31651901
http://dx.doi.org/10.1097/MD.0000000000017699
_version_ 1783464803297132544
author Kang, Yang-Jae
Yoo, Jun-Il
Ha, Yong-chan
author_facet Kang, Yang-Jae
Yoo, Jun-Il
Ha, Yong-chan
author_sort Kang, Yang-Jae
collection PubMed
description The purpose of this study was to verify the usefulness of machine learning (ML) for selection of risk factors and development of predictive models for patients with sarcopenia. We collected medical records from Korean postmenopausal women based on Korea National Health and Nutrition Examination Surveys. A training data set compiled from simple survey data was used to construct models based on popular ML algorithms (e.g., support vector machine, random forest [RF], and logistic regression). A total of 4020 patients ≥65 years of age were enrolled in this study. The study population consisted of 1698 (42.2%) male and 2322 (57.8%) female patients. The 10 most important risk factors in men were body mass index (BMI), red blood cell (RBC) count, blood urea nitrogen (BUN), vitamin D, ferritin, fiber intake (g/d), primary diastolic blood pressure, white blood cell (WBC) count, fat intake (g/d), age, glutamic-pyruvic transaminase, niacin intake (mg/d), protein intake (g/d), fasting blood sugar, and water intake (g/d). The 10 most important risk factors in women were BMI, water intake (g/d), WBC count, RBC count, iron intake (mg/d), BUN, high-density lipoprotein, protein intake (g/d), fiber intake (g/d), vitamin C intake (mg/d), parathyroid hormone, niacin intake (mg/d), carotene intake (μg/d), potassium intake (mg/d), calcium intake (mg/d), sodium intake (mg/d), retinol intake (μg/d), and age. A receiver operating characteristic (ROC) curve analysis found that the areas under the ROC curve of the ML models did not differ significantly within each gender. The most cost-effective approach in clinical practice is to select features using RF models and expert knowledge and to verify disease prediction with several ML models. However, the developed prediction model should be validated in additional studies.
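The ROC comparison mentioned in the description can be illustrated with a minimal sketch. It is not the authors' code: the function name `roc_auc`, the labels, and the two sets of scores are all hypothetical, and it only computes the area under the ROC curve through the rank-sum identity. It does not perform the significance test that a statement such as "not significantly different" would require.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    The AUC equals the probability that a randomly chosen positive case
    gets a higher score than a randomly chosen negative case, with ties
    counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = sarcopenia, 0 = no sarcopenia.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted risks from two different models (not study data).
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]
model_b = [0.8, 0.9, 0.5, 0.2, 0.3, 0.4, 0.1, 0.6]

auc_a = roc_auc(y_true, model_a)
auc_b = roc_auc(y_true, model_b)
print(auc_a)  # 0.9375
print(auc_b)  # 1.0
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example. For a data set of the size described here, a library routine such as scikit-learn's `roc_auc_score` would be the more usual choice.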
format Online
Article
Text
id pubmed-6824801
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Wolters Kluwer Health
record_format MEDLINE/PubMed
spelling pubmed-6824801 2019-11-19 Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study Kang, Yang-Jae Yoo, Jun-Il Ha, Yong-chan Medicine (Baltimore) 6400 Wolters Kluwer Health 2019-10-25 /pmc/articles/PMC6824801/ /pubmed/31651901 http://dx.doi.org/10.1097/MD.0000000000017699 Text en Copyright © 2019 the Author(s). Published by Wolters Kluwer Health, Inc. http://creativecommons.org/licenses/by-nc/4.0 This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial License 4.0 (CCBY-NC), where it is permissible to download, share, remix, transform, and build upon the work provided it is properly cited. The work cannot be used commercially without permission from the journal.
spellingShingle 6400
Kang, Yang-Jae
Yoo, Jun-Il
Ha, Yong-chan
Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title_full Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title_fullStr Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title_full_unstemmed Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title_short Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
title_sort sarcopenia feature selection and risk prediction using machine learning: a cross-sectional study
topic 6400
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824801/
https://www.ncbi.nlm.nih.gov/pubmed/31651901
http://dx.doi.org/10.1097/MD.0000000000017699
work_keys_str_mv AT kangyangjae sarcopeniafeatureselectionandriskpredictionusingmachinelearningacrosssectionalstudy
AT yoojunil sarcopeniafeatureselectionandriskpredictionusingmachinelearningacrosssectionalstudy
AT hayongchan sarcopeniafeatureselectionandriskpredictionusingmachinelearningacrosssectionalstudy