The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS
OBJECTIVE: To explore the optimal fitting path of missing data of the Scale to make the fitting data close to the real situation of patients' data. METHODS: Based on the complete data set of the SDS of 507 patients with stroke, the data simulation sets of Missing Completely at Random (MCAR), Mi...
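The simulation-and-evaluation idea in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' R code: it masks a synthetic score vector under MCAR at a 20% missing rate, fills the gaps by mean substitution (MS), and computes RMSE on the masked entries. The function names, the synthetic 1 to 4 item scores, and the choice to compute RMSE over the masked entries only are all assumptions made for the example.

```python
import math
import random


def make_mcar(values, rate, rng):
    """Copy `values`, setting round(len * rate) entries to None uniformly at random (MCAR)."""
    n_missing = round(len(values) * rate)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def mean_substitute(values):
    """Mean substitution (MS): replace every None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def rmse_on_missing(truth, masked, imputed):
    """RMSE between true and imputed values, over the masked positions only (an assumption)."""
    errs = [(t - f) ** 2 for t, m, f in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errs) / len(errs))


rng = random.Random(0)
# Synthetic item scores in 1..4 for 507 hypothetical respondents; NOT the study data.
truth = [rng.randint(1, 4) for _ in range(507)]
masked = make_mcar(truth, 0.20, rng)
imputed = mean_substitute(masked)
rmse = rmse_on_missing(truth, masked, imputed)
print(f"MCAR 20%, mean substitution, RMSE on masked entries: {rmse:.3f}")
```

Repeating the same loop over several missing rates and swapping in other imputation functions would give a comparison in the spirit of the study, although MAR and MNAR masks need a mechanism that depends on observed or unobserved values, which this sketch does not attempt.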
Main authors: | Lv, Xiaoying; Zhao, Ruonan; Su, Tongsheng; He, Liyun; Song, Rui; Wang, Qizhen; Yu, Xueyun; Zhu, Yanbo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8776472/ https://www.ncbi.nlm.nih.gov/pubmed/35069761 http://dx.doi.org/10.1155/2022/5630748 |
_version_ | 1784636844318654464 |
---|---|
author | Lv, Xiaoying; Zhao, Ruonan; Su, Tongsheng; He, Liyun; Song, Rui; Wang, Qizhen; Yu, Xueyun; Zhu, Yanbo |
author_facet | Lv, Xiaoying; Zhao, Ruonan; Su, Tongsheng; He, Liyun; Song, Rui; Wang, Qizhen; Yu, Xueyun; Zhu, Yanbo |
author_sort | Lv, Xiaoying |
collection | PubMed |
description | OBJECTIVE: To explore the optimal fitting path for missing scale data so that the fitted data are close to patients' real data. METHODS: Based on the complete SDS data set of 507 patients with stroke, simulated data sets were constructed in R under three missing mechanisms, Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR), each with missing rates of 5%, 10%, 15%, 20%, 25%, 30%, 35%, and 40%. Mean substitution (MS), random forest regression (RFR), and predictive mean matching (PMM) were used to fit the data. Root mean square error (RMSE), the width of the 95% confidence interval (95% CI), and the Spearman correlation coefficient (SCC) were used to evaluate the fitting effect and determine the optimal fitting path. RESULTS: When dealing with missing data in scales, the optimal fitting path is as follows. ① Under the MCAR mechanism, when the missing rate is less than 20%, the MS method is the most convenient; when the missing rate is greater than 20%, the RFR algorithm is the best fitting method. ② Under the MAR mechanism, when the missing rate is less than 35%, the MS method is the most convenient; when the missing rate is greater than 35%, RFR has a better correlation. ③ Under the MNAR mechanism, RFR is the best data fitting method, especially when the missing rate is greater than 30%. In practice, when the missing rate is small, complete-case deletion is the most commonly used method, but the RFR algorithm can greatly expand the application scope of samples and save the cost of clinical research when the missing rate is less than 30%. The best way to handle missing data should be chosen according to the missing mechanism and missing rate of the actual data, weighing the statistical analysis ability of the research team, the effectiveness of the method, and the understanding of readers. |
format | Online Article Text |
id | pubmed-8776472 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8776472 2022-01-21 The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS Lv, Xiaoying; Zhao, Ruonan; Su, Tongsheng; He, Liyun; Song, Rui; Wang, Qizhen; Yu, Xueyun; Zhu, Yanbo Evid Based Complement Alternat Med Research Article OBJECTIVE: To explore the optimal fitting path for missing scale data so that the fitted data are close to patients' real data. METHODS: Based on the complete SDS data set of 507 patients with stroke, simulated data sets were constructed in R under three missing mechanisms, Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR), each with missing rates of 5%, 10%, 15%, 20%, 25%, 30%, 35%, and 40%. Mean substitution (MS), random forest regression (RFR), and predictive mean matching (PMM) were used to fit the data. Root mean square error (RMSE), the width of the 95% confidence interval (95% CI), and the Spearman correlation coefficient (SCC) were used to evaluate the fitting effect and determine the optimal fitting path. RESULTS: When dealing with missing data in scales, the optimal fitting path is as follows. ① Under the MCAR mechanism, when the missing rate is less than 20%, the MS method is the most convenient; when the missing rate is greater than 20%, the RFR algorithm is the best fitting method. ② Under the MAR mechanism, when the missing rate is less than 35%, the MS method is the most convenient; when the missing rate is greater than 35%, RFR has a better correlation. ③ Under the MNAR mechanism, RFR is the best data fitting method, especially when the missing rate is greater than 30%. In practice, when the missing rate is small, complete-case deletion is the most commonly used method, but the RFR algorithm can greatly expand the application scope of samples and save the cost of clinical research when the missing rate is less than 30%. The best way to handle missing data should be chosen according to the missing mechanism and missing rate of the actual data, weighing the statistical analysis ability of the research team, the effectiveness of the method, and the understanding of readers. Hindawi 2022-01-13 /pmc/articles/PMC8776472/ /pubmed/35069761 http://dx.doi.org/10.1155/2022/5630748 Text en Copyright © 2022 Xiaoying Lv et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Lv, Xiaoying Zhao, Ruonan Su, Tongsheng He, Liyun Song, Rui Wang, Qizhen Yu, Xueyun Zhu, Yanbo The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title | The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title_full | The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title_fullStr | The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title_full_unstemmed | The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title_short | The Fitting Optimization Path Analysis on Scale Missing Data: Based on the 507 Patients of Poststroke Depression Measured by SDS |
title_sort | fitting optimization path analysis on scale missing data: based on the 507 patients of poststroke depression measured by sds |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8776472/ https://www.ncbi.nlm.nih.gov/pubmed/35069761 http://dx.doi.org/10.1155/2022/5630748 |
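The fitting path reported in the RESULTS part of the description field can be written as a small lookup, which may help when reading the thresholds side by side. This is a hedged sketch: the function name `recommend_method` is hypothetical, missing rates are expressed as fractions, and the abstract does not say which method applies at exactly 20% (MCAR) or 35% (MAR), so assigning those boundary values to RFR is an assumption of this example.

```python
def recommend_method(mechanism: str, missing_rate: float) -> str:
    """Return the method the abstract reports as preferable for a mechanism and missing rate.

    missing_rate is a fraction in [0, 1]. Boundary cases (exactly 0.20 for MCAR,
    exactly 0.35 for MAR) are not specified in the abstract; RFR is assumed here.
    """
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be a fraction between 0 and 1")
    key = mechanism.upper()
    if key == "MCAR":
        return "MS" if missing_rate < 0.20 else "RFR"
    if key == "MAR":
        return "MS" if missing_rate < 0.35 else "RFR"
    if key == "MNAR":
        # RFR is reported as best throughout, especially above a 30% missing rate.
        return "RFR"
    raise ValueError(f"unknown missing mechanism: {mechanism!r}")


for mech, rate in [("MCAR", 0.10), ("MCAR", 0.30), ("MAR", 0.25), ("MNAR", 0.05)]:
    print(mech, rate, recommend_method(mech, rate))
```

The study evaluated missing rates only from 5% to 40%, so any recommendation outside that range is an extrapolation.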