On the Performance of Multiple Imputation Based on Chained Equations in Tackling Missing Data of the African α(3.7)-Globin Deletion in a Malaria Association Study

Multiple imputation based on chained equations (MICE) is an alternative missing genotype method that can use genetic and nongenetic auxiliary data to inform the imputation process. Previously, MICE was successfully tested on strongly linked genetic data. We have now tested it on data of the HBA2 gen...

Bibliographic Details
Main authors: Sepúlveda, Nuno; Manjurano, Alphaxard; Drakeley, Chris; Clark, Taane G
Format: Online Article Text
Language: English
Published: BlackWell Publishing Ltd 2014
Subjects: Original Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4140543/
https://www.ncbi.nlm.nih.gov/pubmed/24942080
http://dx.doi.org/10.1111/ahg.12065
_version_ 1782331524880793600
author Sepúlveda, Nuno
Manjurano, Alphaxard
Drakeley, Chris
Clark, Taane G
author_sort Sepúlveda, Nuno
collection PubMed
description Multiple imputation based on chained equations (MICE) is an alternative method for handling missing genotypes that can use genetic and nongenetic auxiliary data to inform the imputation process. Previously, MICE was successfully tested on strongly linked genetic data. We have now tested it on data of the HBA2 gene which, owing to the experimental design used in a malaria association study in Tanzania, shows a high percentage of missing data and is weakly linked with the remaining genetic markers in the data set. We constructed different imputation models and studied their performance under different missing data conditions. Overall, MICE failed to accurately predict the true genotypes. However, using the best imputation model for the data, we obtained unbiased estimates of the genetic effects and of the association signals of the HBA2 gene on malaria positivity. When the whole data set was analyzed with the same imputation model, the association signal increased from 0.80 before imputation to 2.70 after imputation. Conversely, postimputation estimates of the genetic effects remained the same as in the complete case analysis but showed increased precision. We argue that these postimputation estimates are reasonably unbiased, as a result of a good study design based on matching key socio-environmental factors.
format Online
Article
Text
id pubmed-4140543
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BlackWell Publishing Ltd
record_format MEDLINE/PubMed
spelling pubmed-4140543 2014-09-22 Ann Hum Genet Original Articles BlackWell Publishing Ltd 2014-07 2014-06-18 /pmc/articles/PMC4140543/ /pubmed/24942080 http://dx.doi.org/10.1111/ahg.12065 Text en © 2014 The Authors. Annals of Human Genetics published by John Wiley & Sons Ltd and University College London (UCL).
http://creativecommons.org/licenses/by/3.0/ This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title On the Performance of Multiple Imputation Based on Chained Equations in Tackling Missing Data of the African α(3.7)-Globin Deletion in a Malaria Association Study
topic Original Articles