Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle
BACKGROUND: We investigated strategies and factors affecting accuracy of imputing genotypes from lower-density SNP panels (Illumina 3K, 7K, Affymetrix 15K and 25K, and evenly spaced subsets) up to one medium (Illumina 50K) and one high-density (Illumina 800K) SNP panel. We also evaluated the utility...
Main authors: | Khatkar, Mehar S; Moser, Gerhard; Hayes, Ben J; Raadsma, Herman W |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531262/ https://www.ncbi.nlm.nih.gov/pubmed/23043356 http://dx.doi.org/10.1186/1471-2164-13-538 |
_version_ | 1782254143568609280 |
---|---|
author | Khatkar, Mehar S Moser, Gerhard Hayes, Ben J Raadsma, Herman W |
author_facet | Khatkar, Mehar S Moser, Gerhard Hayes, Ben J Raadsma, Herman W |
author_sort | Khatkar, Mehar S |
collection | PubMed |
description | BACKGROUND: We investigated strategies and factors affecting accuracy of imputing genotypes from lower-density SNP panels (Illumina 3K, 7K, Affymetrix 15K and 25K, and evenly spaced subsets) up to one medium (Illumina 50K) and one high-density (Illumina 800K) SNP panel. We also evaluated the utility of imputed genotypes on the accuracy of genomic selection using Australian Holstein-Friesian cattle data from 2727 and 845 animals genotyped with 50K and 800K SNP chip, respectively. Animals were divided into reference and test sets (genotyped with higher and lower density SNP panels, respectively) for evaluating the accuracies of imputation. For the accuracy of genomic selection, a comparison of direct genetic values (DGV) was made by dividing the data into training and validation sets under a range of imputation scenarios. RESULTS: Of the three methods compared for imputation, IMPUTE2 outperformed Beagle and fastPhase for almost all scenarios. Higher SNP densities in the test animals, larger reference sets and higher relatedness between test and reference animals increased the accuracy of imputation. 50K specific genotypes were imputed with moderate allelic error rates from 15K (2.85%) and 25K (2.75%) genotypes. Using IMPUTE2, SNP genotypes up to 800K were imputed with low allelic error rate (0.79% genome-wide) from 50K genotypes, and with moderate error rate from 3K (4.78%) and 7K (2.00%) genotypes. The error rate of imputing up to 800K from 3K or 7K was further reduced when an additional middle tier of 50K genotypes was incorporated in a 3-tiered framework. Accuracies of DGV for five production traits using imputed 50K genotypes were close to those obtained with the actual 50K genotypes and higher compared to using 3K or 7K genotypes. The loss in accuracy of DGV was small when most of the training animals also had imputed (50K) genotypes. Additional gains in DGV accuracies were small when SNP densities increased from 50K to imputed 800K. 
CONCLUSION: Population-based genotype imputation can be used to predict and combine genotypes from different low, medium and high-density SNP chips with a high level of accuracy. Imputing genotypes from low-density SNP panels to at least 50K SNP density increases the accuracy of genomic selection. |
format | Online Article Text |
id | pubmed-3531262 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-35312622013-01-10 Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle Khatkar, Mehar S Moser, Gerhard Hayes, Ben J Raadsma, Herman W BMC Genomics Research Article BACKGROUND: We investigated strategies and factors affecting accuracy of imputing genotypes from lower-density SNP panels (Illumina 3K, 7K, Affymetrix 15K and 25K, and evenly spaced subsets) up to one medium (Illumina 50K) and one high-density (Illumina 800K) SNP panel. We also evaluated the utility of imputed genotypes on the accuracy of genomic selection using Australian Holstein-Friesian cattle data from 2727 and 845 animals genotyped with 50K and 800K SNP chip, respectively. Animals were divided into reference and test sets (genotyped with higher and lower density SNP panels, respectively) for evaluating the accuracies of imputation. For the accuracy of genomic selection, a comparison of direct genetic values (DGV) was made by dividing the data into training and validation sets under a range of imputation scenarios. RESULTS: Of the three methods compared for imputation, IMPUTE2 outperformed Beagle and fastPhase for almost all scenarios. Higher SNP densities in the test animals, larger reference sets and higher relatedness between test and reference animals increased the accuracy of imputation. 50K specific genotypes were imputed with moderate allelic error rates from 15K (2.85%) and 25K (2.75%) genotypes. Using IMPUTE2, SNP genotypes up to 800K were imputed with low allelic error rate (0.79% genome-wide) from 50K genotypes, and with moderate error rate from 3K (4.78%) and 7K (2.00%) genotypes. The error rate of imputing up to 800K from 3K or 7K was further reduced when an additional middle tier of 50K genotypes was incorporated in a 3-tiered framework. Accuracies of DGV for five production traits using imputed 50K genotypes were close to those obtained with the actual 50K genotypes and higher compared to using 3K or 7K genotypes. 
The loss in accuracy of DGV was small when most of the training animals also had imputed (50K) genotypes. Additional gains in DGV accuracies were small when SNP densities increased from 50K to imputed 800K. CONCLUSION: Population-based genotype imputation can be used to predict and combine genotypes from different low, medium and high-density SNP chips with a high level of accuracy. Imputing genotypes from low-density SNP panels to at least 50K SNP density increases the accuracy of genomic selection. BioMed Central 2012-10-08 /pmc/articles/PMC3531262/ /pubmed/23043356 http://dx.doi.org/10.1186/1471-2164-13-538 Text en Copyright ©2012 Khatkar et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Khatkar, Mehar S Moser, Gerhard Hayes, Ben J Raadsma, Herman W Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title | Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title_full | Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title_fullStr | Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title_full_unstemmed | Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title_short | Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle |
title_sort | strategies and utility of imputed snp genotypes for genomic analysis in dairy cattle |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531262/ https://www.ncbi.nlm.nih.gov/pubmed/23043356 http://dx.doi.org/10.1186/1471-2164-13-538 |
work_keys_str_mv | AT khatkarmehars strategiesandutilityofimputedsnpgenotypesforgenomicanalysisindairycattle AT mosergerhard strategiesandutilityofimputedsnpgenotypesforgenomicanalysisindairycattle AT hayesbenj strategiesandutilityofimputedsnpgenotypesforgenomicanalysisindairycattle AT raadsmahermanw strategiesandutilityofimputedsnpgenotypesforgenomicanalysisindairycattle |