Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies

Bibliographic Details
Main Authors: Pattaro, Cristian; Ruczinski, Ingo; Fallin, Danièle M; Parmigiani, Giovanni
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2547855/
https://www.ncbi.nlm.nih.gov/pubmed/18759977
http://dx.doi.org/10.1186/1471-2164-9-405
_version_ 1782159354005291008
author Pattaro, Cristian
Ruczinski, Ingo
Fallin, Danièle M
Parmigiani, Giovanni
author_sort Pattaro, Cristian
collection PubMed
description BACKGROUND: Identification of disease-related genes in association studies is challenged by the large number of SNPs typed. To address the dilution of power caused by high dimensionality, and to generate results that are biologically interpretable, it is critical to take into consideration spatial correlation of SNPs along the genome. With the goal of identifying true genetic associations, partitioning the genome according to spatial correlation can be a powerful and meaningful way to address this dimensionality problem. RESULTS: We developed and validated an MCMC Algorithm To Identify blocks of Linkage DisEquilibrium (MATILDE) for clustering contiguous SNPs, and a statistical testing framework to detect association using partitions as units of analysis. We compared its ability to detect true SNP associations to that of the most commonly used algorithm for block partitioning, as implemented in the Haploview and HapBlock software. Simulations were based on artificially assigning phenotypes to individuals with SNPs corresponding to region 14q11 of the HapMap database. When block partitioning is performed using MATILDE, the ability to correctly identify a disease SNP is higher, especially for small effects, than it is with the alternatives considered. Advantages can be both in terms of true positive findings and limiting the number of false discoveries. Finer partitions provided by LD-based methods or by marker-by-marker analysis are efficient only for detecting big effects, or in presence of large sample sizes. The probabilistic approach we propose offers several additional advantages, including: a) adapting the estimation of blocks to the population, technology, and sample size of the study; b) probabilistic assessment of uncertainty about block boundaries and about whether any two SNPs are in the same block; c) user selection of the probability threshold for assigning SNPs to the same block. 
CONCLUSION: We demonstrate that, in realistic scenarios, our adaptive, study-specific block partitioning approach is as or more efficient than currently available LD-based approaches in guiding the search for disease loci.
format Text
id pubmed-2547855
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2547855 2008-09-24 Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies Pattaro, Cristian Ruczinski, Ingo Fallin, Danièle M Parmigiani, Giovanni BMC Genomics Methodology Article BioMed Central 2008-08-29 /pmc/articles/PMC2547855/ /pubmed/18759977 http://dx.doi.org/10.1186/1471-2164-9-405 Text en Copyright © 2008 Pattaro et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2547855/
https://www.ncbi.nlm.nih.gov/pubmed/18759977
http://dx.doi.org/10.1186/1471-2164-9-405