SNiPer: Improved SNP genotype calling for Affymetrix 10K GeneChip microarray data

BACKGROUND: High-throughput microarray-based single nucleotide polymorphism (SNP) genotyping has revolutionized the way genome-wide linkage scans and association analyses are performed. One of the key features of the array-based GeneChip® Mapping 10K Array from Affymetrix is the automated SNP call...

Bibliographic Details
Main Authors: Huentelman, Matthew J; Craig, David W; Shieh, Albert D; Corneveaux, Jason J; Hu-Lince, Diane; Pearson, John V; Stephan, Dietrich A
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1280925/
https://www.ncbi.nlm.nih.gov/pubmed/16262895
http://dx.doi.org/10.1186/1471-2164-6-149
author Huentelman, Matthew J
Craig, David W
Shieh, Albert D
Corneveaux, Jason J
Hu-Lince, Diane
Pearson, John V
Stephan, Dietrich A
author_sort Huentelman, Matthew J
collection PubMed
description BACKGROUND: High-throughput microarray-based single nucleotide polymorphism (SNP) genotyping has revolutionized the way genome-wide linkage scans and association analyses are performed. One of the key features of the array-based GeneChip® Mapping 10K Array from Affymetrix is the automated SNP calling algorithm. The Affymetrix algorithm was trained on a database of ethnically diverse DNA samples to create SNP call zones that are used as static models to make genotype calls for experimental data. We describe here the implementation of clustering algorithms on large training datasets, resulting in improved SNP call rates on the 10K GeneChip. RESULTS: A database of 948 individuals genotyped on the GeneChip® Mapping 10K 2.0 Array was used to identify 822 SNPs that were consistently called less than 75% of the time. These SNPs represent on average 8.25% of the total SNPs on each chromosome, with chromosome 19, the most gene-rich chromosome, containing the highest proportion of poor performers (18.7%). To remedy this, we created SNiPer, a new application that uses two clustering algorithms to yield increased call rates and concordance equivalent to that of Affymetrix-called genotypes. We include a training set for these algorithms based on individual genotypes for 705 samples. SNiPer can be retrained on lab-specific training sets. SNiPer is freely available for download at . CONCLUSION: The correct calling of poorly performing SNPs may prove to be key in future linkage studies performed on the 10K GeneChip. It would prove particularly valuable for those diseases that map to chromosome 19, which is known to contain a high proportion of poorly performing SNPs. Our results illustrate that SNiPer can be used to increase call rates on the 10K GeneChip® without sacrificing accuracy, thereby increasing the amount of valid data generated.
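The 75% call-rate screen mentioned in the abstract can be illustrated with a minimal sketch. This is not SNiPer's code and not the Affymetrix algorithm; it only shows, under stated assumptions, how a per-SNP call rate could be computed and compared with a threshold. The data layout (a dictionary mapping SNP IDs to per-sample calls, with None standing for a no-call) and the names call_rate and poor_performers are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Optional

# Assumption: a genotype call is a string such as "AA", "AB" or "BB",
# and None marks a sample for which no call was made.
Call = Optional[str]


def call_rate(calls: List[Call]) -> float:
    """Fraction of samples that received a genotype call for one SNP."""
    if not calls:
        return 0.0
    called = sum(1 for c in calls if c is not None)
    return called / len(calls)


def poor_performers(genotypes: Dict[str, List[Call]],
                    threshold: float = 0.75) -> List[str]:
    """Return the IDs of SNPs whose call rate is strictly below threshold."""
    return sorted(snp for snp, calls in genotypes.items()
                  if call_rate(calls) < threshold)


# Toy data: three hypothetical SNPs typed in four samples.
example = {
    "snp_a": ["AA", "AB", "BB", "AB"],  # called in 4 of 4 samples
    "snp_b": ["AA", None, None, "AB"],  # called in 2 of 4 samples
    "snp_c": ["AA", "AB", None, "BB"],  # called in 3 of 4 samples
}

print(poor_performers(example))
```

With the toy data, only snp_b falls below the 0.75 threshold; snp_c sits exactly at 0.75 and is therefore not flagged, matching the "less than 75%" wording of the abstract.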
format Text
id pubmed-1280925
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1280925 2005-11-10 SNiPer: Improved SNP genotype calling for Affymetrix 10K GeneChip microarray data Huentelman, Matthew J; Craig, David W; Shieh, Albert D; Corneveaux, Jason J; Hu-Lince, Diane; Pearson, John V; Stephan, Dietrich A BMC Genomics Research Article BioMed Central 2005-10-31 /pmc/articles/PMC1280925/ /pubmed/16262895 http://dx.doi.org/10.1186/1471-2164-6-149 Text en Copyright © 2005 Huentelman et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SNiPer: Improved SNP genotype calling for Affymetrix 10K GeneChip microarray data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1280925/
https://www.ncbi.nlm.nih.gov/pubmed/16262895
http://dx.doi.org/10.1186/1471-2164-6-149