Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias
BACKGROUND: High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities a...
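Main authors: | Didion, John P; Yang, Hyuna; Sheppard, Keith; Fu, Chen-Ping; McMillan, Leonard; de Villena, Fernando Pardo-Manuel; Churchill, Gary A |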
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3305361/ https://www.ncbi.nlm.nih.gov/pubmed/22260749 http://dx.doi.org/10.1186/1471-2164-13-34 |
_version_ | 1782227052506644480 |
---|---|
author | Didion, John P; Yang, Hyuna; Sheppard, Keith; Fu, Chen-Ping; McMillan, Leonard; de Villena, Fernando Pardo-Manuel; Churchill, Gary A |
author_facet | Didion, John P; Yang, Hyuna; Sheppard, Keith; Fu, Chen-Ping; McMillan, Leonard; de Villena, Fernando Pardo-Manuel; Churchill, Gary A |
author_sort | Didion, John P |
collection | PubMed |
description | BACKGROUND: High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. RESULTS: We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. CONCLUSION: The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses. |
format | Online Article Text |
id | pubmed-3305361 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3305361 2012-03-16 Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias Didion, John P; Yang, Hyuna; Sheppard, Keith; Fu, Chen-Ping; McMillan, Leonard; de Villena, Fernando Pardo-Manuel; Churchill, Gary A BMC Genomics Methodology Article BioMed Central 2012-01-19 /pmc/articles/PMC3305361/ /pubmed/22260749 http://dx.doi.org/10.1186/1471-2164-13-34 Text en Copyright ©2012 Didion et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3305361/ https://www.ncbi.nlm.nih.gov/pubmed/22260749 http://dx.doi.org/10.1186/1471-2164-13-34 |
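
The description above says that hybridization intensities are converted to genotype calls by assigning each sample to a genotype class at each SNP, and that probes with atypical intensity patterns (VINOs) can be recognized rather than discarded. The following is a minimal sketch of that idea for a single probe, not the MouseDivGeno algorithm. It rests on several assumptions that are not in the record: fixed contrast thresholds stand in for a real clustering step, a candidate VINO is taken to be a sample whose total intensity is far below the probe's median, and the function name `call_genotypes` and the parameters `contrast_cut` and `vino_fraction` are hypothetical.

```python
from statistics import median


def call_genotypes(intensities, contrast_cut=0.5, vino_fraction=0.4):
    """Toy per-probe genotype caller (illustrative only).

    intensities: list of (a, b) allele intensities, one pair per sample.
    Returns one call per sample: "AA", "AB", "BB", or "V" (candidate VINO).
    """
    totals = [a + b for a, b in intensities]
    typical = median(totals)  # typical total intensity for this probe
    calls = []
    for (a, b), total in zip(intensities, totals):
        # Assumption: a markedly reduced total intensity suggests an
        # unaccounted-for variant interfering with hybridization.
        if total <= 0 or total < vino_fraction * typical:
            calls.append("V")
            continue
        # Contrast lies in [-1, 1]: near +1 means mostly A, near -1 mostly B.
        contrast = (a - b) / total
        if contrast > contrast_cut:
            calls.append("AA")
        elif contrast < -contrast_cut:
            calls.append("BB")
        else:
            calls.append("AB")
    return calls


# Hypothetical intensities for five samples at one probe.
samples = [(1000, 50), (980, 60), (500, 520), (40, 1010), (150, 20)]
print(call_genotypes(samples))
```

In this sketch the low-intensity sample is reported as its own class instead of being forced into one of the three expected genotype classes, which is the kind of retained information the description attributes to VINOs.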