Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17

The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulate...

Bibliographic Details
Main Authors: Guo, Wei; Elston, Robert C; Zhu, Xiaofeng
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287844/
https://www.ncbi.nlm.nih.gov/pubmed/22373385
http://dx.doi.org/10.1186/1753-6561-5-S9-S12
author Guo, Wei
Elston, Robert C
Zhu, Xiaofeng
collection PubMed
description The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved.
format Online
Article
Text
id pubmed-3287844
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3287844 2012-02-28 Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17. Guo, Wei; Elston, Robert C; Zhu, Xiaofeng. BMC Proc, Proceedings. BioMed Central 2011-11-29 /pmc/articles/PMC3287844/ /pubmed/22373385 http://dx.doi.org/10.1186/1753-6561-5-S9-S12 Text en Copyright ©2011 Guo et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17
topic Proceedings