
Rare variants analysis using penalization methods for whole genome sequence data

BACKGROUND: Availability of affordable and accessible whole genome sequencing for biomedical applications poses a number of statistical challenges and opportunities, particularly related to the analysis of rare variants and sparseness of the data. Although efforts have been devoted to address these...


Bibliographic Details
Main Authors: Yazdani, Akram, Yazdani, Azam, Boerwinkle, Eric
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4670502/
https://www.ncbi.nlm.nih.gov/pubmed/26637205
http://dx.doi.org/10.1186/s12859-015-0825-4
_version_ 1782404266645782528
author Yazdani, Akram
Yazdani, Azam
Boerwinkle, Eric
author_facet Yazdani, Akram
Yazdani, Azam
Boerwinkle, Eric
author_sort Yazdani, Akram
collection PubMed
description BACKGROUND: Availability of affordable and accessible whole genome sequencing for biomedical applications poses a number of statistical challenges and opportunities, particularly related to the analysis of rare variants and sparseness of the data. Although efforts have been devoted to addressing these challenges, the performance of statistical methods for rare variants analysis still needs further consideration. RESULTS: We introduce a new approach that applies restricted principal component analysis with convex penalization and then selects the best predictors of a phenotype by a concave penalized regression model, while estimating the impact of each genomic region on the phenotype. Using simulated data, we show that the proposed method maintains good power for association testing while keeping the false discovery rate low under a variety of genetic architectures. Illustrative data analyses show encouraging results for this method in comparison with other commonly applied methods for rare variants analysis. CONCLUSION: By taking into account linkage disequilibrium and sparseness of the data, the proposed method improves power and controls the false discovery rate compared to other commonly applied methods for rare variant analyses.
format Online
Article
Text
id pubmed-4670502
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4670502 2015-12-06 Rare variants analysis using penalization methods for whole genome sequence data Yazdani, Akram Yazdani, Azam Boerwinkle, Eric BMC Bioinformatics Methodology Article BACKGROUND: Availability of affordable and accessible whole genome sequencing for biomedical applications poses a number of statistical challenges and opportunities, particularly related to the analysis of rare variants and sparseness of the data. Although efforts have been devoted to addressing these challenges, the performance of statistical methods for rare variants analysis still needs further consideration. RESULTS: We introduce a new approach that applies restricted principal component analysis with convex penalization and then selects the best predictors of a phenotype by a concave penalized regression model, while estimating the impact of each genomic region on the phenotype. Using simulated data, we show that the proposed method maintains good power for association testing while keeping the false discovery rate low under a variety of genetic architectures. Illustrative data analyses show encouraging results for this method in comparison with other commonly applied methods for rare variants analysis. CONCLUSION: By taking into account linkage disequilibrium and sparseness of the data, the proposed method improves power and controls the false discovery rate compared to other commonly applied methods for rare variant analyses. BioMed Central 2015-12-04 /pmc/articles/PMC4670502/ /pubmed/26637205 http://dx.doi.org/10.1186/s12859-015-0825-4 Text en © Yazdani et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Yazdani, Akram
Yazdani, Azam
Boerwinkle, Eric
Rare variants analysis using penalization methods for whole genome sequence data
title Rare variants analysis using penalization methods for whole genome sequence data
title_full Rare variants analysis using penalization methods for whole genome sequence data
title_fullStr Rare variants analysis using penalization methods for whole genome sequence data
title_full_unstemmed Rare variants analysis using penalization methods for whole genome sequence data
title_short Rare variants analysis using penalization methods for whole genome sequence data
title_sort rare variants analysis using penalization methods for whole genome sequence data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4670502/
https://www.ncbi.nlm.nih.gov/pubmed/26637205
http://dx.doi.org/10.1186/s12859-015-0825-4
work_keys_str_mv AT yazdaniakram rarevariantsanalysisusingpenalizationmethodsforwholegenomesequencedata
AT yazdaniazam rarevariantsanalysisusingpenalizationmethodsforwholegenomesequencedata
AT boerwinkleeric rarevariantsanalysisusingpenalizationmethodsforwholegenomesequencedata