Learning the optimal scale for GWAS through hierarchical SNP aggregation
BACKGROUND: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individua...
Main Authors: | Guinot, Florent; Szafranski, Marie; Ambroise, Christophe; Samson, Franck |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6267789/ https://www.ncbi.nlm.nih.gov/pubmed/30497371 http://dx.doi.org/10.1186/s12859-018-2475-9 |
_version_ | 1783376154585989120 |
---|---|
author | Guinot, Florent Szafranski, Marie Ambroise, Christophe Samson, Franck |
author_facet | Guinot, Florent Szafranski, Marie Ambroise, Christophe Samson, Franck |
author_sort | Guinot, Florent |
collection | PubMed |
description | BACKGROUND: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individuals at each locus. Given that an individual’s genotype is characterized by up to one million SNPs, this approach lacks precision, since it may yield a large number of false positives that can lead to erroneous conclusions about genetic associations with the disease. One way to improve the detection of true genetic associations is to reduce the number of hypotheses to be tested by grouping SNPs. RESULTS: We propose a dimension-reduction approach which can be applied in the context of GWAS by making use of the haplotype structure of the human genome. We compare our method with standard univariate and group-based approaches on both synthetic and real GWAS data. CONCLUSION: We show that reducing the dimension of the predictor matrix by aggregating SNPs gives a greater precision in the detection of associations between the phenotype and genomic regions. |
format | Online Article Text |
id | pubmed-6267789 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-62677892018-12-05 Learning the optimal scale for GWAS through hierarchical SNP aggregation Guinot, Florent Szafranski, Marie Ambroise, Christophe Samson, Franck BMC Bioinformatics Methodology Article BACKGROUND: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individuals at each locus. Given that an individual’s genotype is characterized by up to one million SNPs, this approach lacks precision, since it may yield a large number of false positives that can lead to erroneous conclusions about genetic associations with the disease. One way to improve the detection of true genetic associations is to reduce the number of hypotheses to be tested by grouping SNPs. RESULTS: We propose a dimension-reduction approach which can be applied in the context of GWAS by making use of the haplotype structure of the human genome. We compare our method with standard univariate and group-based approaches on both synthetic and real GWAS data. CONCLUSION: We show that reducing the dimension of the predictor matrix by aggregating SNPs gives a greater precision in the detection of associations between the phenotype and genomic regions. BioMed Central 2018-11-29 /pmc/articles/PMC6267789/ /pubmed/30497371 http://dx.doi.org/10.1186/s12859-018-2475-9 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Guinot, Florent Szafranski, Marie Ambroise, Christophe Samson, Franck Learning the optimal scale for GWAS through hierarchical SNP aggregation |
title | Learning the optimal scale for GWAS through hierarchical SNP aggregation |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6267789/ https://www.ncbi.nlm.nih.gov/pubmed/30497371 http://dx.doi.org/10.1186/s12859-018-2475-9 |