
Whole genome sequence analysis of the simulated systolic blood pressure in Genetic Analysis Workshop 18 family data: long-term average and collapsing methods

Bibliographic Details
Main Authors: Sung, Yun Ju; Basson, Jacob; Rao, Dabeeru C
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4143632/
https://www.ncbi.nlm.nih.gov/pubmed/25519365
http://dx.doi.org/10.1186/1753-6561-8-S1-S12
Description
Summary: Analysis of longitudinal family data is challenging because of 2 sources of correlation: correlation across longitudinal measurements and correlation among related individuals. We investigated whether analysis using the long-term average (the average of all 3 visits) can enhance gene discovery compared with a single-visit analysis. We analyzed all 200 replicates of simulated systolic blood pressure (SBP) in the Genetic Analysis Workshop 18 (GAW18) family data using both single-marker and collapsing methods. We considered 2 collapsing approaches: collapsing all variants and collapsing only low-frequency variants. Analysis using the long-term average performed slightly better than analysis using SBP measured at a single visit. Collapsing all variants performed much better than collapsing low-frequency variants at MAP4 and FLNB, each of which included a common variant with a relatively large effect. For several variants in the MAP4 gene, single-marker analysis also provided high power. In contrast, collapsing only low-frequency variants performed much better for SCAP, DNASE1L3, and LOC152217, where rare variants had larger effects than common variants. For the other causal variants, however, all approaches performed disappointingly poorly, apparently because most of these variants explained a very small fraction of the phenotypic variance. We also found that collapsing multiple variants performed worse than single-marker analysis for several genes that contained causal single-nucleotide polymorphisms (SNPs) with both positive and negative effects. Power was also affected by our use of incomplete annotation information: half of the causal SNPs were not found in the annotation file based on the 1000 Genomes Project.
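
The two ideas compared in the summary, the long-term average phenotype and the two collapsing schemes, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy data, and the 0.2 frequency cutoff are all hypothetical (the summary does not state the threshold used), and the sketch ignores the family relatedness that a real GAW18 analysis would have to model.

```python
import numpy as np

def long_term_average(sbp_visits):
    """Average each person's SBP across visits, skipping missing (NaN) visits."""
    return np.nanmean(np.asarray(sbp_visits, dtype=float), axis=1)

def collapse_variants(genotypes, maf_threshold=None):
    """Collapse a gene's variants into one burden score per person.

    genotypes: (n_people, n_variants) minor-allele counts coded 0/1/2.
    maf_threshold: if given, only variants whose sample minor allele
    frequency is below this value are collapsed (hypothetical cutoff).
    """
    g = np.asarray(genotypes, dtype=float)
    if maf_threshold is not None:
        maf = g.mean(axis=0) / 2.0       # sample minor allele frequency
        g = g[:, maf < maf_threshold]    # keep low-frequency variants only
    return g.sum(axis=1)                 # unweighted sum of minor alleles

# Toy data (illustrative only): 4 people, 3 visits, 3 variants in one gene.
sbp = [[120, 124, 128],
       [130, np.nan, 134],
       [110, 112, 114],
       [140, 150, 145]]
geno = [[0, 1, 0],
        [1, 2, 0],
        [0, 1, 1],
        [0, 0, 0]]

lta = long_term_average(sbp)                         # one phenotype per person
burden_all = collapse_variants(geno)                 # collapse all variants
burden_low = collapse_variants(geno, maf_threshold=0.2)  # low-frequency only
```

Because the burden score is an unweighted sum, variants with opposite effect directions can cancel each other, which is consistent with the summary's observation that collapsing performed worse than single-marker analysis for genes containing causal SNPs with both positive and negative effects.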