Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17
The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulate...
Main authors: | Guo, Wei; Elston, Robert C; Zhu, Xiaofeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287844/ https://www.ncbi.nlm.nih.gov/pubmed/22373385 http://dx.doi.org/10.1186/1753-6561-5-S9-S12 |
_version_ | 1782224756516323328 |
---|---|
author | Guo, Wei Elston, Robert C Zhu, Xiaofeng |
author_facet | Guo, Wei Elston, Robert C Zhu, Xiaofeng |
author_sort | Guo, Wei |
collection | PubMed |
description | The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved. |
format | Online Article Text |
id | pubmed-3287844 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-32878442012-02-28 Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17 Guo, Wei Elston, Robert C Zhu, Xiaofeng BMC Proc Proceedings The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved. BioMed Central 2011-11-29 /pmc/articles/PMC3287844/ /pubmed/22373385 http://dx.doi.org/10.1186/1753-6561-5-S9-S12 Text en Copyright ©2011 Guo et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Guo, Wei Elston, Robert C Zhu, Xiaofeng Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17 |
title | Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17 |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287844/ https://www.ncbi.nlm.nih.gov/pubmed/22373385 http://dx.doi.org/10.1186/1753-6561-5-S9-S12 |
work_keys_str_mv | AT guowei evaluationofalassoregressionapproachontheunrelatedsamplesofgeneticanalysisworkshop17 AT elstonrobertc evaluationofalassoregressionapproachontheunrelatedsamplesofgeneticanalysisworkshop17 AT zhuxiaofeng evaluationofalassoregressionapproachontheunrelatedsamplesofgeneticanalysisworkshop17 |
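
The description above mentions two techniques that a short sketch may help make concrete: the collapsing method, which sums the rare SNPs within a gene, and LASSO regression. The code below is an illustration only, not the authors' implementation. It assumes genotypes are coded as minor-allele counts (0, 1, 2), that the two collapsing thresholds are 0.01 and 0.05 (the record does not state which thresholds were used), and that the LASSO objective is the standard (1/2n)·||y − Xb||² + λ·||b||₁ solved by coordinate descent. All function and variable names are hypothetical, and the gene-level F tests and generalized degrees of freedom are not covered.

```python
import numpy as np


def minor_allele_freq(genotypes):
    """MAF per SNP from an (n_individuals, n_snps) matrix of 0/1/2 allele counts."""
    freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(freq, 1.0 - freq)


def collapse_rare(genotypes, maf_threshold):
    """Collapsing score: per-individual sum over SNPs with MAF below the threshold.

    Assumes each column already counts the minor allele.
    """
    rare = minor_allele_freq(genotypes) < maf_threshold
    return genotypes[:, rare].sum(axis=1)


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 for centred X and y."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * x_j' x_j for each column
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:           # monomorphic SNP: nothing to estimate
                continue
            # (1/n) * x_j' (residual with SNP j's own contribution added back)
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_beta)
            beta[j] = new_beta
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical gene: 697 individuals, 10 SNPs, mostly rare variants.
    maf = np.array([0.005, 0.01, 0.02, 0.03, 0.04, 0.02, 0.01, 0.2, 0.3, 0.004])
    G = rng.binomial(2, maf, size=(697, maf.size)).astype(float)
    y = 0.8 * G[:, 2] + rng.normal(size=697)   # simulated quantitative trait

    for threshold in (0.01, 0.05):             # assumed thresholds
        score = collapse_rare(G, threshold)
        print(threshold, "individuals carrying a rare allele:", int((score > 0).sum()))

    Xc, yc = G - G.mean(axis=0), y - y.mean()
    print("LASSO coefficients:", lasso_coordinate_descent(Xc, yc, lam=0.01))
```

The collapsing score depends directly on the chosen threshold, which is the difficulty the description points to: a SNP just above the cutoff is dropped from the sum entirely, whereas the LASSO keeps every SNP as a candidate and lets the penalty decide which coefficients shrink to zero.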