
The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS

OBJECTIVE: To explore the optimal fitting path for missing scale data so that the fitted data stay close to patients' real data. METHODS: Based on the complete SDS data set of 507 patients with stroke, simulated data sets under Missing Completely at Random (MCAR), Mi...


Bibliographic Details
Main authors: Lv, Xiaoying, Zhao, Ruonan, Su, Tongsheng, He, Liyun, Song, Rui, Wang, Qizhen, Yu, Xueyun, Zhu, Yanbo
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8776472/
https://www.ncbi.nlm.nih.gov/pubmed/35069761
http://dx.doi.org/10.1155/2022/5630748
author Lv, Xiaoying
Zhao, Ruonan
Su, Tongsheng
He, Liyun
Song, Rui
Wang, Qizhen
Yu, Xueyun
Zhu, Yanbo
collection PubMed
description OBJECTIVE: To explore the optimal fitting path for missing scale data so that the fitted data stay close to patients' real data. METHODS: Starting from the complete SDS data set of 507 patients with stroke, simulated data sets were constructed in R under three missingness mechanisms, Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR), each with missing rates of 5%, 10%, 15%, 20%, 25%, 30%, 35%, and 40%. Mean substitution (MS), random forest regression (RFR), and predictive mean matching (PMM) were used to fit the data. Root mean square error (RMSE), the width of the 95% confidence interval (95% CI), and the Spearman correlation coefficient (SCC) were used to evaluate the fitting effect and determine the optimal fitting path. RESULTS: For missing scale data, the optimal fitting path is as follows. ① Under the MCAR mechanism, MS is the most convenient method when the missing proportion is less than 20%; when it is greater than 20%, the RFR algorithm is the best fitting method. ② Under the MAR mechanism, MS is the most convenient method when the missing proportion is less than 35%; when it is greater than 35%, RFR gives a better correlation. ③ Under the MNAR mechanism, RFR is the best fitting method, especially when the missing proportion is greater than 30%. In practice, complete-case deletion is the most commonly used method when the missing proportion is small, but the RFR algorithm can greatly expand the usable sample and save clinical research costs when the missing proportion is less than 30%. The best way to handle missing data should be chosen according to the missing mechanism and proportion of the actual data, balancing the statistical analysis ability of the research team, the effectiveness of the method, and the understanding of readers.
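The simulate, impute, and evaluate loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code. It covers only MCAR and MNAR masking, mean substitution, and RMSE; MAR, RFR, PMM, the 95% CI width, and the SCC are left out. The function names, the synthetic 1 to 4 item scores, the choice to compute RMSE over the masked positions only, and the handling of a missing rate exactly at a threshold (which the abstract does not specify) are all assumptions made for illustration.

```python
import math
import random
import statistics


def make_mcar(values, rate, rng):
    """MCAR: every entry has the same chance of being blanked out."""
    n_missing = round(rate * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]


def make_mnar(values, rate, rng):
    """MNAR: the chance of being missing grows with the unseen value itself.

    Uses Efraimidis-Spirakis keys for weighted sampling without replacement;
    assumes all values are positive so they can serve as weights.
    """
    n_missing = round(rate * len(values))
    keys = sorted(
        ((rng.random() ** (1.0 / v), i) for i, v in enumerate(values)),
        reverse=True,
    )
    missing = {i for _, i in keys[:n_missing]}
    return [None if i in missing else v for i, v in enumerate(values)]


def mean_substitute(masked):
    """Mean substitution (MS): fill each gap with the mean of observed values."""
    observed_mean = statistics.fmean([v for v in masked if v is not None])
    return [observed_mean if v is None else v for v in masked]


def rmse_on_missing(truth, filled, masked):
    """RMSE between true and imputed values over the blanked positions only
    (an assumption; the abstract does not define the exact formula)."""
    errors = [(t - f) ** 2 for t, f, m in zip(truth, filled, masked) if m is None]
    return math.sqrt(statistics.fmean(errors)) if errors else 0.0


def recommended_method(mechanism, rate):
    """Encode the abstract's stated path: MS below the threshold, RFR otherwise.

    Assumption: a rate exactly at the threshold is routed to RFR.
    """
    thresholds = {"MCAR": 0.20, "MAR": 0.35}
    mechanism = mechanism.upper()
    if mechanism == "MNAR":
        return "RFR"
    if mechanism not in thresholds:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return "MS" if rate < thresholds[mechanism] else "RFR"


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for one scale item scored 1 to 4 across 507 patients.
    scores = [rng.randint(1, 4) for _ in range(507)]
    for rate in (0.05, 0.20, 0.40):
        for name, make in (("MCAR", make_mcar), ("MNAR", make_mnar)):
            masked = make(scores, rate, rng)
            filled = mean_substitute(masked)
            print(name, rate, round(rmse_on_missing(scores, filled, masked), 3),
                  recommended_method(name, rate))
```

In this sketch the MNAR mask removes high scores more often, so the observed mean is pulled down and mean substitution tends to miss the true values by more than under MCAR. That is the kind of gap a model-based method such as RFR is meant to narrow.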
format Online
Article
Text
id pubmed-8776472
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8776472 2022-01-21 The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS Lv, Xiaoying Zhao, Ruonan Su, Tongsheng He, Liyun Song, Rui Wang, Qizhen Yu, Xueyun Zhu, Yanbo Evid Based Complement Alternat Med Research Article Hindawi 2022-01-13 /pmc/articles/PMC8776472/ /pubmed/35069761 http://dx.doi.org/10.1155/2022/5630748 Text en Copyright © 2022 Xiaoying Lv et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS
topic Research Article