Evaluation methodology for deep learning imputation models

There is growing interest in imputing missing data in tabular datasets using deep learning. Existing deep learning–based imputation models have been commonly evaluated using root mean square error (RMSE) as the predictive accuracy metric. In this article, we investigate the limitations of assessing...

Bibliographic Details
Main authors:	Boursalie, Omar; Samavi, Reza; Doyle, Thomas E.
Format:	Online Article Text
Language:	English
Published:	SAGE Publications 2022
Subjects:	Original Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9791304/ https://www.ncbi.nlm.nih.gov/pubmed/36562377 http://dx.doi.org/10.1177/15353702221121602

author	Boursalie, Omar; Samavi, Reza; Doyle, Thomas E.
author_facet	Boursalie, Omar; Samavi, Reza; Doyle, Thomas E.
author_sort	Boursalie, Omar
collection	PubMed
description	There is growing interest in imputing missing data in tabular datasets using deep learning. Existing deep learning–based imputation models have been commonly evaluated using root mean square error (RMSE) as the predictive accuracy metric. In this article, we investigate the limitations of assessing deep learning–based imputation models by conducting a comparative analysis between RMSE and alternative metrics in the statistical literature including qualitative, predictive accuracy, statistical distance, and descriptive statistics. We design a new aggregated metric, called reconstruction loss (RL), to evaluate deep learning–based imputation models. We also develop and evaluate a novel imputation evaluation methodology based on RL. To minimize model and dataset biases, we use a regression imputation model and two different deep learning imputation models: denoising autoencoders and generative adversarial nets. We also use two tabular datasets from different industry sectors: health care and financial. Our results show that the proposed methodology is effective in evaluating multiple properties of the deep learning–based imputation model’s reconstruction performance.
format	Online Article Text
id	pubmed-9791304
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	SAGE Publications
record_format	MEDLINE/PubMed
spelling	pubmed-9791304 2022-12-27 Evaluation methodology for deep learning imputation models. Boursalie, Omar; Samavi, Reza; Doyle, Thomas E. Exp Biol Med (Maywood). Original Research.
SAGE Publications 2022-09-21 2022-11 /pmc/articles/PMC9791304/ /pubmed/36562377 http://dx.doi.org/10.1177/15353702221121602 Text en © 2022 by the Society for Experimental Biology and Medicine. https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title	Evaluation methodology for deep learning imputation models
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9791304/ https://www.ncbi.nlm.nih.gov/pubmed/36562377 http://dx.doi.org/10.1177/15353702221121602