Learning the optimal scale for GWAS through hierarchical SNP aggregation

BACKGROUND: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individua...

Bibliographic Details
Main authors: Guinot, Florent; Szafranski, Marie; Ambroise, Christophe; Samson, Franck
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6267789/
https://www.ncbi.nlm.nih.gov/pubmed/30497371
http://dx.doi.org/10.1186/s12859-018-2475-9
author Guinot, Florent
Szafranski, Marie
Ambroise, Christophe
Samson, Franck
collection PubMed
description BACKGROUND: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individuals at each locus. Given that an individual’s genotype is characterized by up to one million SNPs, this approach lacks precision, since it may yield a large number of false positives that can lead to erroneous conclusions about genetic associations with the disease. One way to improve the detection of true genetic associations is to reduce the number of hypotheses to be tested by grouping SNPs. RESULTS: We propose a dimension-reduction approach which can be applied in the context of GWAS by making use of the haplotype structure of the human genome. We compare our method with standard univariate and group-based approaches on both synthetic and real GWAS data. CONCLUSION: We show that reducing the dimension of the predictor matrix by aggregating SNPs gives a greater precision in the detection of associations between the phenotype and genomic regions.
format Online
Article
Text
id pubmed-6267789
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6267789 2018-12-05 Learning the optimal scale for GWAS through hierarchical SNP aggregation Guinot, Florent; Szafranski, Marie; Ambroise, Christophe; Samson, Franck BMC Bioinformatics Methodology Article BioMed Central 2018-11-29 /pmc/articles/PMC6267789/ /pubmed/30497371 http://dx.doi.org/10.1186/s12859-018-2475-9 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Learning the optimal scale for GWAS through hierarchical SNP aggregation
topic Methodology Article