
Significance testing in ridge regression for genetic data



Bibliographic Details
Main Authors: Cule, Erika; Vineis, Paolo; De Iorio, Maria
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228544/
https://www.ncbi.nlm.nih.gov/pubmed/21929786
http://dx.doi.org/10.1186/1471-2105-12-372
author Cule, Erika
Vineis, Paolo
De Iorio, Maria
collection PubMed
description BACKGROUND: Technological developments have increased the feasibility of large scale genetic association studies. Densely typed genetic markers are obtained using SNP arrays, next-generation sequencing technologies and imputation. However, SNPs typed using these methods can be highly correlated due to linkage disequilibrium among them, and standard multiple regression techniques fail with these data sets due to their high dimensionality and correlation structure. There has been increasing interest in using penalised regression in the analysis of high dimensional data. Ridge regression is one such penalised regression technique which does not perform variable selection, instead estimating a regression coefficient for each predictor variable. It is therefore desirable to obtain an estimate of the significance of each ridge regression coefficient. RESULTS: We develop and evaluate a test of significance for ridge regression coefficients. Using simulation studies, we demonstrate that the performance of the test is comparable to that of a permutation test, with the advantage of a much-reduced computational cost. We introduce the p-value trace, a plot of the negative logarithm of the p-values of ridge regression coefficients with increasing shrinkage parameter, which enables the visualisation of the change in p-value of the regression coefficients with increasing penalisation. We apply the proposed method to a lung cancer case-control data set from EPIC, the European Prospective Investigation into Cancer and Nutrition. CONCLUSIONS: The proposed test is a useful alternative to a permutation test for the estimation of the significance of ridge regression coefficients, at a much-reduced computational cost. The p-value trace is an informative graphical tool for evaluating the results of a test of significance of ridge regression coefficients as the shrinkage parameter increases, and the proposed test makes its production computationally feasible.
format Online
Article
Text
id pubmed-3228544
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3228544 2011-12-07 Significance testing in ridge regression for genetic data Cule, Erika Vineis, Paolo De Iorio, Maria BMC Bioinformatics Research Article BioMed Central 2011-09-19 /pmc/articles/PMC3228544/ /pubmed/21929786 http://dx.doi.org/10.1186/1471-2105-12-372 Text en Copyright ©2011 Cule et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Significance testing in ridge regression for genetic data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228544/
https://www.ncbi.nlm.nih.gov/pubmed/21929786
http://dx.doi.org/10.1186/1471-2105-12-372