SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association

Association studies have been widely used to identify genetic liability variants for complex diseases. While scanning the chromosomal region 1 single nucleotide polymorphism (SNP) at a time may not fully explore linkage disequilibrium, haplotype analyses tend to require a fairly large number of para...

Bibliographic Details
Main authors: Dai, James Y., Leblanc, Michael, Smith, Nicholas L., Psaty, Bruce, Kooperberg, Charles
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2742496/
https://www.ncbi.nlm.nih.gov/pubmed/19605740
http://dx.doi.org/10.1093/biostatistics/kxp023
author Dai, James Y.
Leblanc, Michael
Smith, Nicholas L.
Psaty, Bruce
Kooperberg, Charles
collection PubMed
description Association studies have been widely used to identify genetic liability variants for complex diseases. While scanning the chromosomal region 1 single nucleotide polymorphism (SNP) at a time may not fully explore linkage disequilibrium, haplotype analyses tend to require a fairly large number of parameters, thus potentially losing power. Clustering algorithms, such as the cladistic approach, have been proposed to reduce the dimensionality, yet they have important limitations. We propose a SNP-Haplotype Adaptive REgression (SHARE) algorithm that seeks the most informative set of SNPs for genetic association in a targeted candidate region by growing and shrinking haplotypes with 1 more or less SNP in a stepwise fashion, and comparing prediction errors of different models via cross-validation. Depending on the evolutionary history of the disease mutations and the markers, this set may contain a single SNP or several SNPs that lay a foundation for haplotype analyses. Haplotype phase ambiguity is effectively accounted for by treating haplotype reconstruction as a part of the learning procedure. Simulations and a data application show that our method has improved power over existing methodologies and that the results are informative in the search for disease-causal loci.
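The stepwise grow-and-shrink search with cross-validation described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' SHARE implementation: it assumes unphased genotypes coded 0/1/2, swaps in a simple lookup-table predictor (mean outcome per genotype pattern) in place of haplotype regression, and does not model haplotype phase ambiguity. The function names `cv_error` and `stepwise_select` and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def cv_error(X, y, subset, k=5, seed=0):
    """Mean squared k-fold cross-validation error for one SNP subset.

    Assumption: each genotype pattern over `subset` predicts the training-fold
    mean outcome for that pattern; unseen patterns fall back to the overall
    training mean. This stands in for the paper's haplotype regression.
    """
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        overall = sum(y[i] for i in train) / len(train)
        sums = defaultdict(lambda: [0.0, 0])
        for i in train:
            key = tuple(X[i][j] for j in subset)
            sums[key][0] += y[i]
            sums[key][1] += 1
        for i in fold:
            key = tuple(X[i][j] for j in subset)
            if key in sums:
                pred = sums[key][0] / sums[key][1]
            else:
                pred = overall
            total += (y[i] - pred) ** 2
    return total / len(y)


def stepwise_select(X, y, k=5, seed=0):
    """Grow or shrink the SNP set by one SNP per step while CV error drops."""
    n_snps = len(X[0])
    current = ()
    best_err = cv_error(X, y, current, k, seed)
    improved = True
    while improved:
        improved = False
        # Candidate moves: add one SNP not in the set, or drop one that is.
        grow = [tuple(sorted(current + (j,)))
                for j in range(n_snps) if j not in current]
        shrink = [tuple(s for s in current if s != j) for j in current]
        for cand in grow + shrink:
            err = cv_error(X, y, cand, k, seed)
            if err < best_err - 1e-12:
                best_err, best_cand = err, cand
                improved = True
        if improved:
            current = best_cand
    return current, best_err


if __name__ == "__main__":
    # Toy data (hypothetical): 5 SNPs, outcome driven by SNPs 1 and 3 only.
    rng = random.Random(1)
    X = [[rng.randint(0, 2) for _ in range(5)] for _ in range(300)]
    y = [row[1] + 2 * row[3] for row in X]
    print(stepwise_select(X, y))
```

Because every accepted move strictly lowers the cross-validated error, the search stops after finitely many steps. Depending on the data, the selected set may be a single SNP or several, which mirrors the behaviour the abstract describes.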
format Text
id pubmed-2742496
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2742496 2009-09-14 SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association Dai, James Y. Leblanc, Michael Smith, Nicholas L. Psaty, Bruce Kooperberg, Charles Biostatistics Articles
Oxford University Press 2009-10 2009-07-15 /pmc/articles/PMC2742496/ /pubmed/19605740 http://dx.doi.org/10.1093/biostatistics/kxp023 Text en © 2009 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2742496/
https://www.ncbi.nlm.nih.gov/pubmed/19605740
http://dx.doi.org/10.1093/biostatistics/kxp023