Missing data in bioarchaeology II: A test of ordinal and continuous data imputation

OBJECTIVES: Previous research has shown that while missing data are common in bioarchaeological studies, they are seldom handled using statistically rigorous methods. The primary objective of this article is to evaluate the ability of imputation to manage missing data and encourage the use of advanc...


Bibliographic Details
Main authors: Wissler, Amanda; Blevins, Kelly E.; Buikstra, Jane E.
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825894/
https://www.ncbi.nlm.nih.gov/pubmed/36790608
http://dx.doi.org/10.1002/ajpa.24614
author Wissler, Amanda
Blevins, Kelly E.
Buikstra, Jane E.
collection PubMed
description OBJECTIVES: Previous research has shown that while missing data are common in bioarchaeological studies, they are seldom handled using statistically rigorous methods. The primary objective of this article is to evaluate the ability of imputation to manage missing data and to encourage the use of advanced statistical methods in bioarchaeology and paleopathology. An overview of missing data management in biological anthropology is provided, followed by a test of imputation and deletion methods for handling missing data. MATERIALS AND METHODS: Missing data were simulated on complete datasets of ordinal (n = 287) and continuous (n = 369) bioarchaeological data. Missing values were imputed using five imputation methods (mean, predictive mean matching, random forest, expectation maximization, and stochastic regression), and the success of each at recovering the parameters of the original dataset was compared with that of pairwise and listwise deletion. RESULTS: In all instances, listwise deletion was least successful at approximating the original parameters. Imputation of continuous data was more effective than imputation of ordinal data. Overall, no one method performed best, and the amount of missing data proved a stronger predictor of imputation success than the choice of method. DISCUSSION: These findings support the use of imputation methods over deletion for handling missing bioarchaeological and paleopathology data, especially when the data are continuous. Whereas deletion methods reduce sample size, imputation maintains sample size, improving statistical power and preventing bias from being introduced into the dataset.
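The comparison described in the abstract (simulate missing values on a complete dataset, then check how well deletion and imputation recover the original parameters) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the data are synthetic Gaussian values, missingness is assumed to be completely at random at a 20% rate, only mean imputation and listwise deletion are shown, and all function names are hypothetical.

```python
import random
import statistics


def simulate_mcar(rows, missing_rate, rng):
    """Copy rows, blanking each value independently with probability missing_rate."""
    return [[None if rng.random() < missing_rate else v for v in row] for row in rows]


def listwise_delete(rows):
    """Keep only the rows that have no missing values."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    col_means = []
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(statistics.fmean(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def column_means(rows):
    """Mean of every column of a complete table."""
    return [statistics.fmean(col) for col in zip(*rows)]


rng = random.Random(0)

# Synthetic stand-in for a complete continuous dataset. The row count mirrors
# the abstract (n = 369); the values themselves are invented for illustration.
complete = [[rng.gauss(50, 10), rng.gauss(30, 5), rng.gauss(100, 20)] for _ in range(369)]

# Assumption: 20% of values go missing completely at random.
with_gaps = simulate_mcar(complete, 0.2, rng)

deleted = listwise_delete(with_gaps)
imputed = mean_impute(with_gaps)

print("rows after listwise deletion:", len(deleted))
print("rows after mean imputation:", len(imputed))
print("original column means:", column_means(complete))
print("listwise column means:", column_means(deleted))
print("imputed column means:", column_means(imputed))
```

The sketch compares only column means. Mean imputation is known to shrink the variance of the imputed columns, so a fuller comparison would also track spread and correlations.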
format Online
Article
Text
id pubmed-9825894
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley & Sons, Inc.
record_format MEDLINE/PubMed
spelling pubmed-9825894 2023-01-09 Missing data in bioarchaeology II: A test of ordinal and continuous data imputation Wissler, Amanda Blevins, Kelly E. Buikstra, Jane E. Am J Biol Anthropol Research Articles John Wiley & Sons, Inc. 2022-09-12 2022-11 /pmc/articles/PMC9825894/ /pubmed/36790608 http://dx.doi.org/10.1002/ajpa.24614 Text en © 2022 The Authors. American Journal of Biological Anthropology published by Wiley Periodicals LLC.
This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Missing data in bioarchaeology II: A test of ordinal and continuous data imputation
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825894/
https://www.ncbi.nlm.nih.gov/pubmed/36790608
http://dx.doi.org/10.1002/ajpa.24614