Efficiency of multiple imputation to test for association in the presence of missing data

Bibliographic Details
Main authors:	Croiseau, Pascal; Bardel, Claire; Génin, Emmanuelle
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367517/ https://www.ncbi.nlm.nih.gov/pubmed/18466521

author	Croiseau, Pascal; Bardel, Claire; Génin, Emmanuelle
collection	PubMed
description	The presence of missing data in association studies is an important problem, particularly with high-density single-nucleotide polymorphism (SNP) maps, because the probability that at least one genotype is missing dramatically increases with the number of markers. A possible strategy is to simply ignore the missing data and only use the complete observations, and, consequently, to accept a significant decrease of the sample size. Using Genetic Analysis Workshop 15 simulated data on which we removed some genotypes to generate different levels of missing data, we show that this strategy might lead to an important loss in power to detect association, but may also result in false conclusions regarding the most likely susceptibility site if another marker is in linkage disequilibrium with the disease susceptibility site. We propose a multiple imputation approach to deal with missing data on case-parent trios and evaluated the performance of this approach on the same simulated data. We found that our multiple imputation approach has high power to detect association with the susceptibility site even with a large amount of missing data, and can identify the susceptibility sites among a set of sites in linkage disequilibrium.
format	Text
id	pubmed-2367517
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2367517 2008-05-06 Efficiency of multiple imputation to test for association in the presence of missing data Croiseau, Pascal Bardel, Claire Génin, Emmanuelle BMC Proc Proceedings BioMed Central 2007-12-18 /pmc/articles/PMC2367517/ /pubmed/18466521 Text en Copyright © 2007 Croiseau et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Efficiency of multiple imputation to test for association in the presence of missing data
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367517/ https://www.ncbi.nlm.nih.gov/pubmed/18466521