A double classification tree search algorithm for index SNP selection

BACKGROUND: In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices...
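To make the selection problem in the abstract concrete, here is a minimal Python sketch of choosing a small set of SNP positions that still tells all haplotype patterns apart. It is not the authors' double classification tree search; it uses a plain greedy set-cover heuristic as a stand-in. The function name `greedy_index_snps`, the encoding of haplotypes as equal-length allele strings, and the toy data are all assumptions made for illustration. Like any heuristic for this problem, it returns a valid set of index SNPs but not necessarily a minimum one.

```python
from itertools import combinations


def greedy_index_snps(haplotypes):
    """Return SNP column indices that keep all distinct haplotypes distinguishable.

    Illustrative greedy set-cover heuristic (hypothetical helper, not the
    paper's algorithm). Each haplotype is a string with one allele per SNP.
    """
    unique = sorted(set(haplotypes))
    if not unique:
        return []
    n_snps = len(unique[0])
    if any(len(h) != n_snps for h in unique):
        raise ValueError("all haplotypes must have the same length")

    # Every pair of distinct haplotypes must be split by at least one chosen SNP.
    unresolved = set(combinations(range(len(unique)), 2))
    chosen = []

    def split_by(snp):
        # Pairs still unresolved that differ at this SNP position.
        return {(i, j) for (i, j) in unresolved if unique[i][snp] != unique[j][snp]}

    while unresolved:
        # Pick the SNP that separates the most unresolved pairs;
        # ties go to the lowest column index so the result is deterministic.
        best = max(range(n_snps), key=lambda s: (len(split_by(s)), -s))
        unresolved -= split_by(best)
        chosen.append(best)

    return sorted(chosen)


# Toy data: columns 0 and 1 are perfectly correlated, as are columns 2 and 3,
# so one SNP from each correlated pair is enough.
toy = ["0000", "0011", "1100", "1111"]
print(greedy_index_snps(toy))  # → [0, 2]
```

The loop always terminates because any two distinct haplotypes differ at some position, so each round resolves at least one pair. An exact minimum would require searching over subsets of columns, which is the step that becomes infeasible for large data sets.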

Bibliographic Details
Main authors: Zhang, Peisen; Sheng, Huitao; Uehara, Ryuhei
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC476734/
https://www.ncbi.nlm.nih.gov/pubmed/15238162
http://dx.doi.org/10.1186/1471-2105-5-89
_version_ 1782121634908340224
author Zhang, Peisen
Sheng, Huitao
Uehara, Ryuhei
author_facet Zhang, Peisen
Sheng, Huitao
Uehara, Ryuhei
author_sort Zhang, Peisen
collection PubMed
description BACKGROUND: In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices for identifying haplotypes or haplotype blocks in genetic association studies. We refer to these characteristic SNPs as index SNPs. In order to reduce costs and work, a minimum number of index SNPs that can distinguish all SNP and haplotype patterns should be chosen. Unfortunately, this is an NP-complete problem, requiring brute force algorithms that are not feasible for large data sets. RESULTS: We have developed a double classification tree search algorithm to generate index SNPs that can distinguish all SNP and haplotype patterns. This algorithm runs very rapidly and generates very good, though not necessarily minimum, sets of index SNPs, as is to be expected for such NP-complete problems. CONCLUSIONS: A new algorithm for index SNP selection has been developed. A webserver for index SNP selection is available at
format Text
id pubmed-476734
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4767342004-07-18 A double classification tree search algorithm for index SNP selection Zhang, Peisen Sheng, Huitao Uehara, Ryuhei BMC Bioinformatics Software BACKGROUND: In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices for identifying haplotypes or haplotype blocks in genetic association studies. We refer to these characteristic SNPs as index SNPs. In order to reduce costs and work, a minimum number of index SNPs that can distinguish all SNP and haplotype patterns should be chosen. Unfortunately, this is an NP-complete problem, requiring brute force algorithms that are not feasible for large data sets. RESULTS: We have developed a double classification tree search algorithm to generate index SNPs that can distinguish all SNP and haplotype patterns. This algorithm runs very rapidly and generates very good, though not necessarily minimum, sets of index SNPs, as is to be expected for such NP-complete problems. CONCLUSIONS: A new algorithm for index SNP selection has been developed. A webserver for index SNP selection is available at BioMed Central 2004-07-06 /pmc/articles/PMC476734/ /pubmed/15238162 http://dx.doi.org/10.1186/1471-2105-5-89 Text en Copyright © 2004 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Software
Zhang, Peisen
Sheng, Huitao
Uehara, Ryuhei
A double classification tree search algorithm for index SNP selection
title A double classification tree search algorithm for index SNP selection
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC476734/
https://www.ncbi.nlm.nih.gov/pubmed/15238162
http://dx.doi.org/10.1186/1471-2105-5-89