
Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?

BACKGROUND: Diagnostic models are frequently used to assess the role of risk factors on disease complications, and therefore to avoid them. Missing data is an issue that challenges the model making. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when mi...


Bibliographic Details
Main Authors: Baneshi, MR, Faramarzi, H, Marzban, M
Format: Online Article Text
Language: English
Published: Tehran University of Medical Sciences 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481660/
https://www.ncbi.nlm.nih.gov/pubmed/23113124
_version_ 1782247765115404288
author Baneshi, MR
Faramarzi, H
Marzban, M
collection PubMed
description BACKGROUND: Diagnostic models are frequently used to assess the role of risk factors in disease complications, and therefore to help avoid them. Missing data complicate model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when missing data exist. METHODS: HIV patients (n=1460) referred to the Voluntary Counseling and Testing Center (VCT) of Shiraz, southern Iran, during 2004–2009 were recruited. The univariate association between each variable and death was assessed. Only variables with univariate P < 0.25 were offered to the multifactorial models. First, patients with missing data on candidate variables were deleted (complete-case, C-C model). Then, missing data were imputed using Multivariate Imputation by Chained Equations (MICE). Logistic regression was fitted to the C-C data set and to the imputed data sets (MICE model). Models were compared in terms of the number of variables retained in the final model, the width of confidence intervals, and discrimination ability. RESULTS: About 22% of the data were lost in the C-C model. The numbers of variables retained in the C-C and MICE models were 2 and 6, respectively. Confidence intervals (CIs) from the C-C model were wider than those from the MICE model. The MICE model showed greater discrimination ability than the C-C model (70% versus 64%). CONCLUSION: The C-C analysis resulted in a loss of power and wide CIs. Once missing data were imputed, more variables reached significance and CIs were narrower. Therefore, we recommend the application of the imputation method for handling missing data.
format Online
Article
Text
id pubmed-3481660
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Tehran University of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-3481660 2012-10-30 Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data? Baneshi, MR Faramarzi, H Marzban, M Iran J Public Health Original Article
Tehran University of Medical Sciences 2012-01-31 /pmc/articles/PMC3481660/ /pubmed/23113124 Text en Copyright © Iranian Public Health Association & Tehran University of Medical Sciences http://creativecommons.org/licenses/by-nc/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution NonCommercial 3.0 License (CC BY-NC 3.0), which allows users to read, copy, distribute and make derivative works for non-commercial purposes from the material, as long as the author of the original work is cited properly.
title Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481660/
https://www.ncbi.nlm.nih.gov/pubmed/23113124
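
The abstract's complete-case (C-C) step, in which patients with missing data on any candidate variable are deleted, can be illustrated with a minimal Python sketch. The records, the variable names (`age`, `cd4`, `hb`), and the helper functions below are hypothetical and are not taken from the study. The sketch only shows how modest missingness in each variable can add up to a larger share of patients being dropped; it does not implement MICE or the logistic regression models.

```python
# Toy records; field names and values are hypothetical, not data from the study.
# None marks a missing value.
patients = [
    {"age": 34, "cd4": 210, "hb": 12.1},
    {"age": 41, "cd4": None, "hb": 13.0},
    {"age": None, "cd4": 350, "hb": 11.4},
    {"age": 29, "cd4": 480, "hb": 14.2},
    {"age": 52, "cd4": 150, "hb": None},
    {"age": 38, "cd4": 300, "hb": 12.8},
    {"age": 45, "cd4": 95, "hb": 10.9},
    {"age": 31, "cd4": 410, "hb": 13.5},
]


def complete_cases(records: list, variables: list) -> list:
    """Keep only records with an observed value for every candidate variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


def fraction_lost(records: list, variables: list) -> float:
    """Share of records dropped by complete-case deletion."""
    kept = complete_cases(records, variables)
    return 1 - len(kept) / len(records)


candidates = ["age", "cd4", "hb"]
kept = complete_cases(patients, candidates)
lost = fraction_lost(patients, candidates)

# Each variable is missing for one patient in eight, but a different patient
# each time, so the complete-case data set loses three patients in eight.
print(f"{len(kept)} of {len(patients)} patients kept; {lost:.1%} lost")
```

In this toy example each variable is missing for only one patient, yet deletion removes three of the eight records, which is the kind of loss that imputation is meant to avoid.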