Construction and assessment of prediction rules for binary outcome in the presence of missing predictor data using multiple imputation and cross‐validation: Methodological approach and data‐based evaluation
We investigate calibration and assessment of predictive rules when missing values are present in the predictors. Our paper has two key objectives. The first is to investigate how the calibration of the prediction rule can be combined with use of multiple imputation to account for missing predictor o...
Main authors: | Mertens, Bart J. A.; Banzato, Erika; de Wreede, Liesbeth C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2020 |
Subjects: | Research Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217034/ https://www.ncbi.nlm.nih.gov/pubmed/32052492 http://dx.doi.org/10.1002/bimj.201800289 |
_version_ | 1783532534636740608 |
---|---|
author | Mertens, Bart J. A.; Banzato, Erika; de Wreede, Liesbeth C. |
collection | PubMed |
description | We investigate calibration and assessment of predictive rules when missing values are present in the predictors. Our paper has two key objectives. The first is to investigate how the calibration of the prediction rule can be combined with use of multiple imputation to account for missing predictor observations. The second objective is to propose such methods that can be implemented with current multiple imputation software, while allowing for unbiased predictive assessment through validation on new observations for which outcome is not yet available. We commence with a review of the methodological foundations of multiple imputation as a model estimation approach as opposed to a purely algorithmic description. We specifically contrast application of multiple imputation for parameter (effect) estimation with predictive calibration. Based on this review, two approaches are formulated, of which the second utilizes application of the classical Rubin's rules for parameter estimation, while the first approach averages probabilities from models fitted on single imputations to directly approximate the predictive density for future observations. We present implementations using current software that allow for validation and estimation of performance measures by cross‐validation, as well as imputation of missing data in predictors on the future data where outcome is missing by definition. To simplify, we restrict discussion to binary outcome and logistic regression throughout. Method performance is verified through application on two real data sets. Accuracy (Brier score) and variance of predicted probabilities are investigated. Results show substantial reductions in variation of calibrated probabilities when using the first approach. |
format | Online Article Text |
id | pubmed-7217034 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7217034 2020-05-13 Biom J Research Papers John Wiley and Sons Inc. 2020-02-13 2020-05 /pmc/articles/PMC7217034/ /pubmed/32052492 http://dx.doi.org/10.1002/bimj.201800289 Text en © 2020 The Authors. Biometrical Journal published by WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title | Construction and assessment of prediction rules for binary outcome in the presence of missing predictor data using multiple imputation and cross‐validation: Methodological approach and data‐based evaluation |
topic | Research Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217034/ https://www.ncbi.nlm.nih.gov/pubmed/32052492 http://dx.doi.org/10.1002/bimj.201800289 |
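
The description field in this record contrasts two ways of turning multiply imputed data into a predicted probability: averaging the probabilities from logistic models fitted on single imputations, and pooling the coefficients with Rubin's rules before predicting. The following is a minimal sketch of that contrast only, not the authors' implementation. It assumes the per-imputation logistic regression coefficients are already available and that the new observation's predictors are complete (or already imputed). The coefficient values, the predictor vector, and all function names are hypothetical and chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(beta, x):
    """Intercept beta[0] plus the dot product of the slopes with x."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def predict_average_probabilities(betas, x):
    """First approach: predict with each single-imputation model,
    then average the resulting probabilities."""
    probs = [sigmoid(linear_predictor(beta, x)) for beta in betas]
    return sum(probs) / len(probs)


def pool_coefficients(betas):
    """Rubin's rules point estimate: the mean of the per-imputation
    coefficient vectors (variance pooling is omitted in this sketch)."""
    m = len(betas)
    return [sum(column) / m for column in zip(*betas)]


def predict_pooled_coefficients(betas, x):
    """Second approach: pool the coefficients first, then predict once."""
    return sigmoid(linear_predictor(pool_coefficients(betas), x))


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities
    and observed binary outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Hypothetical coefficients (intercept, slope 1, slope 2) from three imputations.
betas = [
    [-1.0, 0.8, 0.5],
    [-1.2, 1.0, 0.4],
    [-0.8, 0.6, 0.6],
]
x_new = [1.0, 2.0]  # hypothetical, fully observed predictor values

p_avg_prob = predict_average_probabilities(betas, x_new)
p_pooled_coef = predict_pooled_coefficients(betas, x_new)
print(p_avg_prob, p_pooled_coef)
```

Because the logistic function is nonlinear, the two approaches generally give different probabilities for the same observation. In a cross‐validation setting, `brier_score` would be applied to the held-out predictions from either approach.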