Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus...
Main authors: | Hong, Sungyeon; Kim, Yongkang; Park, Taesung |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica, 2015 |
Subjects: | Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298256/ https://www.ncbi.nlm.nih.gov/pubmed/25635166 http://dx.doi.org/10.4137/CIN.S16350 |
_version_ | 1782353243817377792 |
---|---|
author | Hong, Sungyeon; Kim, Yongkang; Park, Taesung |
author_sort | Hong, Sungyeon |
collection | PubMed |
description | Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus on identifying single nucleotide polymorphisms (SNPs) associated with a disease of interest, have produced ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most statistical methods have adopted a two-stage approach: pre-screening for dimensional reduction and variable selection to identify causal SNPs. The pre-screening step selects SNPs in terms of their P-values or the absolute values of the regression coefficients in single SNP analysis. Penalized regressions, such as the ridge, lasso, adaptive lasso, and elastic-net regressions, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets. |
format | Online Article Text |
id | pubmed-4298256 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-4298256 2015-01-29 Cancer Inform Review Libertas Academica 2015-01-14 /pmc/articles/PMC4298256/ /pubmed/25635166 http://dx.doi.org/10.4137/CIN.S16350 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License. |
title | Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298256/ https://www.ncbi.nlm.nih.gov/pubmed/25635166 http://dx.doi.org/10.4137/CIN.S16350 |
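
The two-stage approach summarized in the description (pre-screening by single-SNP regression, then penalized regression on the retained SNPs) can be sketched as below. This is an illustrative sketch on simulated genotypes, not the authors' pipeline or data: the function names `marginal_screen` and `lasso_cd`, the simulated effect sizes, the screening size `keep`, and the penalty `lam` are all assumptions chosen for the example. Screening here ranks SNPs by the absolute single-SNP regression coefficient, and the second stage uses a plain lasso fitted by coordinate descent; ridge, adaptive lasso, or elastic-net would slot into the same place.

```python
import random


def marginal_screen(X, y, keep):
    """Stage 1: rank SNPs by |slope| of a single-SNP least-squares fit, keep the top `keep`."""
    n = len(y)
    y_mean = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        x_mean = sum(col) / n
        sxx = sum((x - x_mean) ** 2 for x in col)
        sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in zip(col, y))
        beta = sxy / sxx if sxx > 0 else 0.0  # monomorphic SNPs get score 0
        scores.append((abs(beta), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:keep])


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Stage 2: lasso via cyclic coordinate descent on centered data.

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1 (intercept handled by centering).
    """
    n, p = len(y), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residual for beta = 0
    norms = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if norms[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j's own effect added back).
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
    return beta


# Simulated example (assumed values): 200 samples, 300 SNPs coded 0/1/2, two causal SNPs.
rng = random.Random(0)
n, p = 200, 300
causal = {5: 1.0, 50: -0.8}
X = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(p)] for _ in range(n)]
y = [sum(b * row[j] for j, b in causal.items()) + rng.gauss(0, 0.5) for row in X]

screened = marginal_screen(X, y, keep=20)             # stage 1: dimension reduction
X_small = [[row[j] for j in screened] for row in X]
beta = lasso_cd(X_small, y, lam=0.1)                  # stage 2: variable selection
selected = {j: b for j, b in zip(screened, beta) if b != 0.0}
print(selected)
```

In this sketch `keep` and `lam` are fixed by hand; in practice both would typically be tuned, for example by cross-validation, and screening by P-value instead of coefficient magnitude only changes the score computed in `marginal_screen`.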