Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis

Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus...

Bibliographic Details
Main authors: Hong, Sungyeon; Kim, Yongkang; Park, Taesung
Format: Online Article Text
Language: English
Published: Libertas Academica 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298256/
https://www.ncbi.nlm.nih.gov/pubmed/25635166
http://dx.doi.org/10.4137/CIN.S16350
_version_ 1782353243817377792
author Hong, Sungyeon
Kim, Yongkang
Park, Taesung
author_facet Hong, Sungyeon
Kim, Yongkang
Park, Taesung
author_sort Hong, Sungyeon
collection PubMed
description Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus on identifying single nucleotide polymorphisms (SNPs) associated with a disease of interest, have produced ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most statistical methods have adopted a two-stage approach: pre-screening for dimensional reduction and variable selection to identify causal SNPs. The pre-screening step selects SNPs in terms of their P-values or the absolute values of the regression coefficients in single SNP analysis. Penalized regressions, such as the ridge, lasso, adaptive lasso, and elastic-net regressions, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets.
format Online
Article
Text
id pubmed-4298256
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-42982562015-01-29 Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis Hong, Sungyeon Kim, Yongkang Park, Taesung Cancer Inform Review Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus on identifying single nucleotide polymorphisms (SNPs) associated with a disease of interest, have produced ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most statistical methods have adopted a two-stage approach: pre-screening for dimensional reduction and variable selection to identify causal SNPs. The pre-screening step selects SNPs in terms of their P-values or the absolute values of the regression coefficients in single SNP analysis. Penalized regressions, such as the ridge, lasso, adaptive lasso, and elastic-net regressions, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets. Libertas Academica 2015-01-14 /pmc/articles/PMC4298256/ /pubmed/25635166 http://dx.doi.org/10.4137/CIN.S16350 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
spellingShingle Review
Hong, Sungyeon
Kim, Yongkang
Park, Taesung
Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title_full Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title_fullStr Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title_full_unstemmed Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title_short Practical Issues in Screening and Variable Selection in Genome-Wide Association Analysis
title_sort practical issues in screening and variable selection in genome-wide association analysis
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298256/
https://www.ncbi.nlm.nih.gov/pubmed/25635166
http://dx.doi.org/10.4137/CIN.S16350
work_keys_str_mv AT hongsungyeon practicalissuesinscreeningandvariableselectioningenomewideassociationanalysis
AT kimyongkang practicalissuesinscreeningandvariableselectioningenomewideassociationanalysis
AT parktaesung practicalissuesinscreeningandvariableselectioningenomewideassociationanalysis