SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association
Association studies have been widely used to identify genetic liability variants for complex diseases. While scanning the chromosomal region 1 single nucleotide polymorphism (SNP) at a time may not fully explore linkage disequilibrium, haplotype analyses tend to require a fairly large number of para...
Main authors: | Dai, James Y.; Leblanc, Michael; Smith, Nicholas L.; Psaty, Bruce; Kooperberg, Charles |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2009 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2742496/ https://www.ncbi.nlm.nih.gov/pubmed/19605740 http://dx.doi.org/10.1093/biostatistics/kxp023 |
_version_ | 1782171821352681472 |
---|---|
author | Dai, James Y.; Leblanc, Michael; Smith, Nicholas L.; Psaty, Bruce; Kooperberg, Charles |
author_sort | Dai, James Y. |
collection | PubMed |
description | Association studies have been widely used to identify genetic liability variants for complex diseases. While scanning the chromosomal region 1 single nucleotide polymorphism (SNP) at a time may not fully explore linkage disequilibrium, haplotype analyses tend to require a fairly large number of parameters, thus potentially losing power. Clustering algorithms, such as the cladistic approach, have been proposed to reduce the dimensionality, yet they have important limitations. We propose a SNP-Haplotype Adaptive REgression (SHARE) algorithm that seeks the most informative set of SNPs for genetic association in a targeted candidate region by growing and shrinking haplotypes by one SNP at a time in a stepwise fashion, and comparing prediction errors of different models via cross-validation. Depending on the evolutionary history of the disease mutations and the markers, this set may contain a single SNP or several SNPs that lay a foundation for haplotype analyses. Haplotype phase ambiguity is effectively accounted for by treating haplotype reconstruction as a part of the learning procedure. Simulations and a data application show that our method has improved power over existing methodologies and that the results are informative in the search for disease-causal loci. |
format | Text |
id | pubmed-2742496 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2742496 2009-09-14 SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association Dai, James Y.; Leblanc, Michael; Smith, Nicholas L.; Psaty, Bruce; Kooperberg, Charles Biostatistics Articles Oxford University Press 2009-10 2009-07-15 /pmc/articles/PMC2742496/ /pubmed/19605740 http://dx.doi.org/10.1093/biostatistics/kxp023 Text en © 2009 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2742496/ https://www.ncbi.nlm.nih.gov/pubmed/19605740 http://dx.doi.org/10.1093/biostatistics/kxp023 |
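The description above outlines a stepwise search that grows or shrinks a SNP set by one SNP at a time and compares candidate models by cross-validated prediction error. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes phased haplotypes are already known (the paper instead treats haplotype reconstruction as part of the learning procedure), uses a smoothed per-haplotype case fraction as a stand-in for the regression model, and uses squared error as the prediction error. All function names (`simulate`, `fit_scores`, `cv_error`, `make_folds`, `stepwise_select`) and the simulated data are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n=200, n_snps=5, seed=1):
    """Toy data (an assumption): risk rises with copies of allele 1 at SNPs 1 and 3."""
    rng = random.Random(seed)
    haps, y = [], []
    for _ in range(n):
        pair = tuple(tuple(rng.randint(0, 1) for _ in range(n_snps)) for _ in range(2))
        risk = sum(h[1] == 1 and h[3] == 1 for h in pair)
        y.append(1 if rng.random() < 0.1 + 0.35 * risk else 0)
        haps.append(pair)
    return haps, y


def fit_scores(haps, y, snps, idx):
    """Smoothed case fraction per sub-haplotype on `snps`, using training rows `idx`."""
    cases, total = defaultdict(float), defaultdict(float)
    for i in idx:
        for h in haps[i]:
            key = tuple(h[s] for s in snps)
            cases[key] += y[i]
            total[key] += 1
    prevalence = sum(y[i] for i in idx) / len(idx)
    # Shrink each haplotype's case fraction toward the overall prevalence.
    scores = {k: (cases[k] + prevalence) / (total[k] + 1.0) for k in total}
    return scores, prevalence


def make_folds(n, k, seed=0):
    """Split indices 0..n-1 into k disjoint folds after a seeded shuffle."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[j::k] for j in range(k)]


def cv_error(haps, y, snps, folds):
    """Cross-validated mean squared error of the haplotype score on `snps`."""
    snps = sorted(snps)
    sq_err = 0.0
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        scores, prevalence = fit_scores(haps, y, snps, train)
        for i in test:
            keys = [tuple(h[s] for s in snps) for h in haps[i]]
            # Haplotypes unseen in training fall back to the training prevalence.
            pred = sum(scores.get(key, prevalence) for key in keys) / 2
            sq_err += (pred - y[i]) ** 2
    return sq_err / len(y)


def stepwise_select(haps, y, n_snps, k=5, seed=0):
    """Greedy search: add or drop one SNP per step while the CV error improves."""
    folds = make_folds(len(y), k, seed)
    current = frozenset()
    best = cv_error(haps, y, current, folds)
    while True:
        neighbours = [current | {s} for s in range(n_snps) if s not in current]
        neighbours += [current - {s} for s in current]
        err, cand = min((cv_error(haps, y, c, folds), sorted(c)) for c in neighbours)
        if err >= best:
            return sorted(current), best
        current, best = frozenset(cand), err
```

A call such as `stepwise_select(*simulate(), n_snps=5)` returns the selected SNP indices and their cross-validated error. Because the same folds both guide the search and score the final set, the reported error is optimistic; a nested or held-out evaluation would be needed for an honest estimate.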