Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data?
BACKGROUND: Diagnostic models are frequently used to assess the effect of risk factors on disease complications, and thereby to help prevent them. Missing data are a common challenge in model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when mi...
Main authors: | Baneshi, MR; Faramarzi, H; Marzban, M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Tehran University of Medical Sciences, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481660/ https://www.ncbi.nlm.nih.gov/pubmed/23113124 |
_version_ | 1782247765115404288 |
---|---|
author | Baneshi, MR Faramarzi, H Marzban, M |
collection | PubMed |
description | BACKGROUND: Diagnostic models are frequently used to assess the effect of risk factors on disease complications, and thereby to help prevent them. Missing data are a common challenge in model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when missing data exist. METHODS: HIV patients (n=1460) referred to the Voluntary Counseling and Testing Center (VCT) in Shiraz, southern Iran, during 2004–2009 were recruited. The univariate association between each variable and death was assessed. Only variables with a univariate P < 0.25 were offered to the multifactorial models. First, patients with missing data on candidate variables were deleted (complete-case, or C-C, model). Then missing data were imputed by applying Multivariable Imputation via Chained Equations (MICE). Logistic regression was fitted to the C-C data set and to the imputed data sets (MICE model). The models were compared in terms of the number of variables retained in the final model, the width of the confidence intervals, and discrimination ability. RESULT: About 22% of the data were lost in the C-C model. The number of variables retained in the C-C and MICE models was 2 and 6, respectively. Confidence intervals (CIs) for the C-C model were wider than those for the MICE model. The MICE model showed greater discrimination ability than the C-C model (70% versus 64%). CONCLUSION: The C-C analysis resulted in loss of power and wide CIs. Once missing data were imputed, more variables reached significance and the CIs were narrower. Therefore, we recommend applying the imputation method for handling missing data. |
format | Online Article Text |
id | pubmed-3481660 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Tehran University of Medical Sciences |
record_format | MEDLINE/PubMed |
spelling | pubmed-3481660 2012-10-30 Iran J Public Health Original Article Tehran University of Medical Sciences 2012-01-31 /pmc/articles/PMC3481660/ /pubmed/23113124 Text en Copyright © Iranian Public Health Association & Tehran University of Medical Sciences http://creativecommons.org/licenses/by-nc/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution NonCommercial 3.0 License (CC BY-NC 3.0), which allows users to read, copy, distribute and make derivative works for non-commercial purposes from the material, as long as the author of the original work is cited properly. |
title | Prevention of Disease Complications through Diagnostic Models: How to Tackle the Problem of Missing Data? |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481660/ https://www.ncbi.nlm.nih.gov/pubmed/23113124 |
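The comparison described in the abstract (complete-case deletion versus chained-equations imputation, each followed by logistic regression and a discrimination check) can be illustrated with a small sketch. This is not the authors' code or data: the simulated covariates, the missingness rates, and the helper names `chained_impute`, `fit_logistic`, `predict`, and `auc` are all assumptions made for illustration. The imputation step is a simplified stochastic regression imputation cycled over columns; a full MICE implementation would also draw the regression parameters and pool variances with Rubin's rules.

```python
import numpy as np


def chained_impute(X, n_iter=10, rng=None):
    """One simplified chained-equations imputation of a float array with NaNs."""
    rng = np.random.default_rng() if rng is None else rng
    X = X.copy()
    miss = np.isnan(X)
    # Initialise missing cells with the observed column means.
    X[miss] = np.take(np.nanmean(X, axis=0), np.where(miss)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            # Regress column j on an intercept plus all other columns.
            A = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            resid_sd = np.std(X[obs, j] - A[obs] @ beta)
            # Replace missing cells with prediction plus residual noise.
            noise = rng.normal(0.0, resid_sd, (~obs).sum())
            X[~obs, j] = A[~obs] @ beta + noise
    return X


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton's method; returns [intercept, slopes]."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ w))
        # Hessian with a tiny ridge term for numerical stability only.
        H = A.T @ (A * (p * (1 - p))[:, None]) + 1e-6 * np.eye(A.shape[1])
        w += np.linalg.solve(H, A.T @ (y - p))
    return w


def predict(w, X):
    """Predicted probabilities from fitted coefficients."""
    return 1.0 / (1.0 + np.exp(-(np.column_stack([np.ones(len(X)), X]) @ w)))


def auc(y, score):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    # Simulated covariates and outcome (illustrative, not the study data).
    X = rng.normal(size=(n, 3))
    X[:, 1] += 0.7 * X[:, 0]
    y = (rng.random(n) < 1 / (1 + np.exp(-(X @ [0.8, 0.5, -0.4])))).astype(float)
    X_miss = X.copy()
    X_miss[rng.random(n) < 0.15, 1] = np.nan
    X_miss[rng.random(n) < 0.10, 2] = np.nan

    # Complete-case model: drop every row with any missing covariate.
    complete = ~np.isnan(X_miss).any(axis=1)
    w_cc = fit_logistic(X_miss[complete], y[complete])

    # Imputed model: five imputations (outcome included as a predictor),
    # one logistic fit per imputed data set, coefficients averaged.
    imputed = [chained_impute(np.column_stack([X_miss, y]), rng=rng)[:, :-1]
               for _ in range(5)]
    w_mi = np.mean([fit_logistic(Xi, y) for Xi in imputed], axis=0)

    print("rows kept in complete-case analysis:", complete.sum(), "of", n)
    print("AUC, complete-case:", auc(y[complete], predict(w_cc, X_miss[complete])))
    print("AUC, imputed:", auc(y, predict(w_mi, imputed[0])))
```

Averaging coefficients across imputations gives only the pooled point estimate; comparing confidence-interval widths, as the study does, would additionally require the within- and between-imputation variances.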