Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors
BACKGROUND: Runs of homozygosity are long, uninterrupted stretches of homozygous genotypes that enable reliable estimation of levels of inbreeding (i.e., autozygosity) based on high-throughput, chip-based single nucleotide polymorphism (SNP) genotypes. While the theoretical definition of runs of hom...
Main authors: | Ferenčaković, Maja; Sölkner, Johann; Curik, Ino |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4176748/ https://www.ncbi.nlm.nih.gov/pubmed/24168655 http://dx.doi.org/10.1186/1297-9686-45-42 |
_version_ | 1782336674517221376 |
---|---|
author | Ferenčaković, Maja Sölkner, Johann Curik, Ino |
author_facet | Ferenčaković, Maja Sölkner, Johann Curik, Ino |
author_sort | Ferenčaković, Maja |
collection | PubMed |
description | BACKGROUND: Runs of homozygosity are long, uninterrupted stretches of homozygous genotypes that enable reliable estimation of levels of inbreeding (i.e., autozygosity) based on high-throughput, chip-based single nucleotide polymorphism (SNP) genotypes. While the theoretical definition of runs of homozygosity is straightforward, their empirical identification depends on the type of SNP chip used to obtain the data and on a number of factors, including the number of heterozygous calls allowed to account for genotyping errors. We analyzed how SNP chip density and genotyping errors affect estimates of autozygosity based on runs of homozygosity in three cattle populations, using genotype data from an SNP chip with 777 972 SNPs and a 50 k chip. RESULTS: Data from the 50 k chip led to overestimation of the number of runs of homozygosity that are shorter than 4 Mb, since the analysis could not identify heterozygous SNPs that were present on the denser chip. Conversely, data from the denser chip led to underestimation of the number of runs of homozygosity that were longer than 8 Mb, unless the presence of a small number of heterozygous SNP genotypes was allowed within a run of homozygosity. CONCLUSIONS: We have shown that SNP chip density and genotyping errors introduce patterns of bias in the estimation of autozygosity based on runs of homozygosity. SNP chips with 50 000 to 60 000 markers are frequently available for livestock species and their information leads to a conservative prediction of autozygosity from runs of homozygosity longer than 4 Mb. Not allowing heterozygous SNP genotypes to be present in a homozygosity run, as has been advocated for human populations, is not adequate for livestock populations because they have much higher levels of autozygosity and therefore longer runs of homozygosity. When allowing a small number of heterozygous calls, current software does not differentiate between situations where these calls are adjacent and therefore indicative of an actual break of the run versus those where they are scattered across the length of the homozygous segment. Simple graphical tests that are used in this paper are a current, yet tedious solution. |
format | Online Article Text |
id | pubmed-4176748 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-41767482014-09-28 Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors Ferenčaković, Maja Sölkner, Johann Curik, Ino Genet Sel Evol Research BACKGROUND: Runs of homozygosity are long, uninterrupted stretches of homozygous genotypes that enable reliable estimation of levels of inbreeding (i.e., autozygosity) based on high-throughput, chip-based single nucleotide polymorphism (SNP) genotypes. While the theoretical definition of runs of homozygosity is straightforward, their empirical identification depends on the type of SNP chip used to obtain the data and on a number of factors, including the number of heterozygous calls allowed to account for genotyping errors. We analyzed how SNP chip density and genotyping errors affect estimates of autozygosity based on runs of homozygosity in three cattle populations, using genotype data from an SNP chip with 777 972 SNPs and a 50 k chip. RESULTS: Data from the 50 k chip led to overestimation of the number of runs of homozygosity that are shorter than 4 Mb, since the analysis could not identify heterozygous SNPs that were present on the denser chip. Conversely, data from the denser chip led to underestimation of the number of runs of homozygosity that were longer than 8 Mb, unless the presence of a small number of heterozygous SNP genotypes was allowed within a run of homozygosity. CONCLUSIONS: We have shown that SNP chip density and genotyping errors introduce patterns of bias in the estimation of autozygosity based on runs of homozygosity. SNP chips with 50 000 to 60 000 markers are frequently available for livestock species and their information leads to a conservative prediction of autozygosity from runs of homozygosity longer than 4 Mb. Not allowing heterozygous SNP genotypes to be present in a homozygosity run, as has been advocated for human populations, is not adequate for livestock populations because they have much higher levels of autozygosity and therefore longer runs of homozygosity. When allowing a small number of heterozygous calls, current software does not differentiate between situations where these calls are adjacent and therefore indicative of an actual break of the run versus those where they are scattered across the length of the homozygous segment. Simple graphical tests that are used in this paper are a current, yet tedious solution. BioMed Central 2013-10-29 /pmc/articles/PMC4176748/ /pubmed/24168655 http://dx.doi.org/10.1186/1297-9686-45-42 Text en Copyright © 2013 Ferenčaković et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Ferenčaković, Maja Sölkner, Johann Curik, Ino Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title | Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title_full | Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title_fullStr | Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title_full_unstemmed | Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title_short | Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors |
title_sort | estimating autozygosity from high-throughput information: effects of snp density and genotyping errors |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4176748/ https://www.ncbi.nlm.nih.gov/pubmed/24168655 http://dx.doi.org/10.1186/1297-9686-45-42 |
work_keys_str_mv | AT ferencakovicmaja estimatingautozygosityfromhighthroughputinformationeffectsofsnpdensityandgenotypingerrors AT solknerjohann estimatingautozygosityfromhighthroughputinformationeffectsofsnpdensityandgenotypingerrors AT curikino estimatingautozygosityfromhighthroughputinformationeffectsofsnpdensityandgenotypingerrors |
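
The abstract in this record describes two ideas that a small example may make more concrete: runs of homozygosity are detected with a tolerance for a few heterozygous calls, and a tolerant scan cannot by itself tell adjacent heterozygous calls (a likely real break) from scattered ones (likely genotyping errors). The sketch below is not the authors' method or any existing software's algorithm. It is a simple greedy scan written for illustration, and every name in it (`find_runs`, `Run`, `roh_fraction`, the 0/1/2 genotype coding, the default thresholds) is an assumption introduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence

HET = 1  # assumed coding: 0 and 2 are homozygous calls, 1 is heterozygous


@dataclass
class Run:
    start_bp: int
    end_bp: int
    n_snps: int
    n_het: int
    adjacent_hets: bool  # True if two heterozygous calls sit side by side

    @property
    def length_bp(self) -> int:
        return self.end_bp - self.start_bp


def find_runs(positions: Sequence[int], genotypes: Sequence[int],
              max_het: int = 0, min_length_bp: int = 1_000_000,
              min_snps: int = 15) -> List[Run]:
    """Greedy left-to-right scan for runs with at most max_het heterozygous calls."""
    runs: List[Run] = []
    n = len(genotypes)
    i = 0
    while i < n:
        if genotypes[i] == HET:  # a run must start on a homozygous call
            i += 1
            continue
        j, n_het, last_hom = i, 0, i
        while j + 1 < n:
            if genotypes[j + 1] == HET:
                if n_het == max_het:  # one more het would exceed the tolerance
                    break
                n_het += 1
            else:
                last_hom = j + 1
            j += 1
        # Trim trailing heterozygous calls so the run ends on a homozygous SNP.
        segment = list(genotypes[i:last_hom + 1])
        hets = sum(1 for g in segment if g == HET)
        adjacent = any(a == HET and b == HET
                       for a, b in zip(segment, segment[1:]))
        run = Run(positions[i], positions[last_hom], len(segment), hets, adjacent)
        if run.length_bp >= min_length_bp and run.n_snps >= min_snps:
            runs.append(run)
        i = last_hom + 1
    return runs


def roh_fraction(runs: Sequence[Run], genome_length_bp: int) -> float:
    """Share of the covered genome inside runs (an F_ROH-style ratio)."""
    return sum(r.length_bp for r in runs) / genome_length_bp


# Toy data: 11 SNPs spaced 100 bp apart, thresholds scaled down to match.
positions = [k * 100 for k in range(11)]
genotypes = [0, 0, 2, 1, 2, 2, 0, 1, 1, 0, 0]
strict = find_runs(positions, genotypes, max_het=0, min_length_bp=100, min_snps=2)
one_het = find_runs(positions, genotypes, max_het=1, min_length_bp=100, min_snps=2)
loose = find_runs(positions, genotypes, max_het=3, min_length_bp=100, min_snps=2)
```

On this toy input the strict setting splits the sequence into several short runs, while the most permissive setting merges everything into one run whose `adjacent_hets` flag is set. That flag is one possible way to mark runs that deserve the kind of visual inspection the abstract mentions; it does not decide whether the break is real.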