The ability of different imputation methods for missing values in mental measurement questionnaires
BACKGROUND: Incomplete data have a particularly important influence on mental measurement questionnaires. Most experts, however, focus mainly on clinical trials and cohort studies and generally pay less attention to this deficiency. We aim to compare the accuracy of four common methods for handling...
Main authors: | , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2020
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7045426/ https://www.ncbi.nlm.nih.gov/pubmed/32103723 http://dx.doi.org/10.1186/s12874-020-00932-0 |
Summary: | BACKGROUND: Incomplete data have a particularly important influence on mental measurement questionnaires. Most experts, however, focus mainly on clinical trials and cohort studies and generally pay less attention to this deficiency. We aim to compare the accuracy of four common methods for handling missing items in different psychology questionnaires according to the item non-response rates. METHOD: All data were drawn from previous studies that used the self-acceptance scale (SAQ), the activities of daily living scale (ADL) and the self-esteem scale (RSES). The SAQ and ADL datasets (the simulation group) were used to compare and assess four imputation methods: direct deletion, mode imputation, hot-deck (HD) imputation and multiple imputation (MI). Performance was measured by absolute deviation, root mean square error and average relative error at missing proportions of 5, 10, 15 and 20%. The RSES dataset (the validation group) was used to test the application of the imputation methods. All analyses were performed in SAS 9.4. RESULTS: The biases obtained by MI were the smallest under all missing proportions. The HD imputation approach yielded the lowest absolute deviation of standard deviation values. The two methods produced similar results, and both performed clearly better than direct deletion and mode imputation. In the real-world validation, the respondents’ average score in the complete dataset was 28.22 ± 4.63, which differed little from the scores in the imputed datasets. The direction of the influence of the five factors on self-esteem was consistent, although there were some differences in the size and range of the OR values in the logistic regression model. CONCLUSION: MI shows the best performance, although it demands slightly more data-analytic capacity and programming skill. HD could be considered for imputing missing values in psychological investigations when MI cannot be performed due to limited circumstances. |
---|
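The simulation design in the summary (mask a known share of responses, impute, then score the result against the complete data) can be illustrated with a minimal Python sketch. This is not the authors' SAS 9.4 code. The function names, the single-item setup, the random-donor form of hot-deck imputation and the choice to compute the root mean square error over all responses are assumptions made for illustration only; multiple imputation and the other two error measures are omitted.

```python
import random
from collections import Counter
from math import sqrt


def mask_values(values, proportion, rng):
    """Return a copy with round(proportion * n) responses set to None."""
    n_missing = round(proportion * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def mode_impute(values):
    """Fill every missing response with the most frequent observed response."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def hot_deck_impute(values, rng):
    """Fill each missing response with a randomly drawn observed response.

    Assumption: donors come from the whole item, with no matching on
    respondent characteristics.
    """
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


def rmse(true_values, imputed_values):
    """Root mean square error between complete and imputed responses."""
    squared = [(t - i) ** 2 for t, i in zip(true_values, imputed_values)]
    return sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical complete item: 200 responses on a 1-4 scale.
    item = [rng.randint(1, 4) for _ in range(200)]
    for proportion in (0.05, 0.10, 0.15, 0.20):
        masked = mask_values(item, proportion, rng)
        print(
            proportion,
            round(rmse(item, mode_impute(masked)), 3),
            round(rmse(item, hot_deck_impute(masked, rng)), 3),
        )
```

Because the complete responses are known before masking, each method can be scored directly, which is why the summary separates a simulation group from a validation group.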