Cargando…
Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling
Methods that can evaluate aggregate effects of rare and common variants are limited. Therefore, we applied a two-stage approach to evaluate aggregate gene effects in the 1000 Genomes Project data, which contain 24,487 single-nucleotide polymorphisms (SNPs) in 697 unrelated individuals from 7 populat...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2011
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287884/ https://www.ncbi.nlm.nih.gov/pubmed/22373404 http://dx.doi.org/10.1186/1753-6561-5-S9-S47 |
_version_ | 1782224765545611264 |
---|---|
author | Nock, NL Zhang, LX |
author_facet | Nock, NL Zhang, LX |
author_sort | Nock, NL |
collection | PubMed |
description | Methods that can evaluate aggregate effects of rare and common variants are limited. Therefore, we applied a two-stage approach to evaluate aggregate gene effects in the 1000 Genomes Project data, which contain 24,487 single-nucleotide polymorphisms (SNPs) in 697 unrelated individuals from 7 populations. In stage 1, we identified potentially interesting genes (PIGs) as those having at least one SNP meeting Bonferroni correction using univariate, multiple regression models. In stage 2, we evaluate aggregate PIG effects on trait, Q1, by modeling each gene as a latent construct, which is defined by multiple common and rare variants, using the multivariate statistical framework of structural equation modeling (SEM). In stage 1, we found that PIGs varied markedly between a randomly selected replicate (replicate 137) and 100 other replicates, with the exception of FLT1. In stage 1, collapsing rare variants decreased false positives but increased false negatives. In stage 2, we developed a good-fitting SEM model that included all nine genes simulated to affect Q1 (FLT1, KDR, ARNT, ELAV4, FLT4, HIF1A, HIF3A, VEGFA, VEGFC) and found that FLT1 had the largest effect on Q1 (β(std) = 0.33 ± 0.05). Using replicate 137 estimates as population values, we found that the mean relative bias in the parameters (loadings, paths, residuals) and their standard errors across 100 replicates was on average, less than 5%. Our latent variable SEM approach provides a viable framework for modeling aggregate effects of rare and common variants in multiple genes, but more elegant methods are needed in stage 1 to minimize type I and type II error. |
format | Online Article Text |
id | pubmed-3287884 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-32878842012-02-28 Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling Nock, NL Zhang, LX BMC Proc Proceedings Methods that can evaluate aggregate effects of rare and common variants are limited. Therefore, we applied a two-stage approach to evaluate aggregate gene effects in the 1000 Genomes Project data, which contain 24,487 single-nucleotide polymorphisms (SNPs) in 697 unrelated individuals from 7 populations. In stage 1, we identified potentially interesting genes (PIGs) as those having at least one SNP meeting Bonferroni correction using univariate, multiple regression models. In stage 2, we evaluate aggregate PIG effects on trait, Q1, by modeling each gene as a latent construct, which is defined by multiple common and rare variants, using the multivariate statistical framework of structural equation modeling (SEM). In stage 1, we found that PIGs varied markedly between a randomly selected replicate (replicate 137) and 100 other replicates, with the exception of FLT1. In stage 1, collapsing rare variants decreased false positives but increased false negatives. In stage 2, we developed a good-fitting SEM model that included all nine genes simulated to affect Q1 (FLT1, KDR, ARNT, ELAV4, FLT4, HIF1A, HIF3A, VEGFA, VEGFC) and found that FLT1 had the largest effect on Q1 (β(std) = 0.33 ± 0.05). Using replicate 137 estimates as population values, we found that the mean relative bias in the parameters (loadings, paths, residuals) and their standard errors across 100 replicates was on average, less than 5%. Our latent variable SEM approach provides a viable framework for modeling aggregate effects of rare and common variants in multiple genes, but more elegant methods are needed in stage 1 to minimize type I and type II error. BioMed Central 2011-11-29 /pmc/articles/PMC3287884/ /pubmed/22373404 http://dx.doi.org/10.1186/1753-6561-5-S9-S47 Text en Copyright ©2011 Nock and Zhang; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Nock, NL Zhang, LX Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title | Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title_full | Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title_fullStr | Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title_full_unstemmed | Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title_short | Evaluating aggregate effects of rare and common variants in the 1000 Genomes Project exon sequencing data using latent variable structural equation modeling |
title_sort | evaluating aggregate effects of rare and common variants in the 1000 genomes project exon sequencing data using latent variable structural equation modeling |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287884/ https://www.ncbi.nlm.nih.gov/pubmed/22373404 http://dx.doi.org/10.1186/1753-6561-5-S9-S47 |
work_keys_str_mv | AT nocknl evaluatingaggregateeffectsofrareandcommonvariantsinthe1000genomesprojectexonsequencingdatausinglatentvariablestructuralequationmodeling AT zhanglx evaluatingaggregateeffectsofrareandcommonvariantsinthe1000genomesprojectexonsequencingdatausinglatentvariablestructuralequationmodeling |