
Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors

Bibliographic Details
Main Authors: Ferenčaković, Maja; Sölkner, Johann; Curik, Ino
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4176748/
https://www.ncbi.nlm.nih.gov/pubmed/24168655
http://dx.doi.org/10.1186/1297-9686-45-42
_version_ 1782336674517221376
author Ferenčaković, Maja
Sölkner, Johann
Curik, Ino
collection PubMed
description BACKGROUND: Runs of homozygosity are long, uninterrupted stretches of homozygous genotypes that enable reliable estimation of levels of inbreeding (i.e., autozygosity) based on high-throughput, chip-based single nucleotide polymorphism (SNP) genotypes. While the theoretical definition of runs of homozygosity is straightforward, their empirical identification depends on the type of SNP chip used to obtain the data and on a number of factors, including the number of heterozygous calls allowed to account for genotyping errors. We analyzed how SNP chip density and genotyping errors affect estimates of autozygosity based on runs of homozygosity in three cattle populations, using genotype data from an SNP chip with 777 972 SNPs and a 50 k chip.
RESULTS: Data from the 50 k chip led to overestimation of the number of runs of homozygosity that are shorter than 4 Mb, since the analysis could not identify heterozygous SNPs that were present on the denser chip. Conversely, data from the denser chip led to underestimation of the number of runs of homozygosity that were longer than 8 Mb, unless the presence of a small number of heterozygous SNP genotypes was allowed within a run of homozygosity.
CONCLUSIONS: We have shown that SNP chip density and genotyping errors introduce patterns of bias in the estimation of autozygosity based on runs of homozygosity. SNP chips with 50 000 to 60 000 markers are frequently available for livestock species and their information leads to a conservative prediction of autozygosity from runs of homozygosity longer than 4 Mb. Not allowing heterozygous SNP genotypes to be present in a homozygosity run, as has been advocated for human populations, is not adequate for livestock populations because they have much higher levels of autozygosity and therefore longer runs of homozygosity. When allowing a small number of heterozygous calls, current software does not differentiate between situations where these calls are adjacent and therefore indicative of an actual break of the run versus those where they are scattered across the length of the homozygous segment. Simple graphical tests that are used in this paper are a current, yet tedious solution.
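The role of the allowed number of heterozygous calls can be illustrated with a minimal sketch. This is not the authors' procedure and not the algorithm of any existing software; it is a simple greedy scan written under stated assumptions: genotypes are coded as allele counts (0 and 2 homozygous, 1 heterozygous), missing calls are ignored, and the names `Run`, `find_runs`, and `f_roh` are hypothetical. It also flags runs in which two tolerated heterozygous calls sit side by side, the situation the abstract describes as indicative of an actual break, and computes the commonly used ratio of total run length to genome length as an autozygosity estimate.

```python
from typing import List, NamedTuple, Sequence


class Run(NamedTuple):
    start_bp: int            # position of the first homozygous SNP in the run
    end_bp: int              # position of the last homozygous SNP in the run
    n_het: int               # heterozygous calls tolerated inside the run
    has_adjacent_hets: bool  # True if two tolerated calls are neighbouring SNPs


def find_runs(genotypes: Sequence[int], positions: Sequence[int],
              min_length_bp: int, max_het: int = 0) -> List[Run]:
    """Greedy scan for runs of homozygosity (illustrative only).

    Assumed coding: 0 or 2 = homozygous, 1 = heterozygous.
    """
    runs: List[Run] = []
    n = len(genotypes)
    i = 0
    while i < n:
        if genotypes[i] == 1:        # a run must start on a homozygous SNP
            i += 1
            continue
        het_idx: List[int] = []
        end = i                      # index of the last homozygous SNP seen
        j = i + 1
        while j < n:
            if genotypes[j] == 1:
                if len(het_idx) == max_het:
                    break            # one heterozygous call too many
                het_idx.append(j)
            else:
                end = j
            j += 1
        # Only heterozygous calls before the last homozygous SNP are inside the run.
        inside = [h for h in het_idx if h < end]
        if positions[end] - positions[i] >= min_length_bp:
            adjacent = any(b - a == 1 for a, b in zip(inside, inside[1:]))
            runs.append(Run(positions[i], positions[end], len(inside), adjacent))
        i = end + 1
    return runs


def f_roh(runs: Sequence[Run], genome_length_bp: int) -> float:
    """Share of the genome covered by runs (total run length / genome length)."""
    return sum(r.end_bp - r.start_bp for r in runs) / genome_length_bp


# Toy data: 12 SNPs spaced 1 Mb apart (made-up values, for illustration only).
toy_genotypes = [0, 0, 0, 1, 0, 0, 2, 2, 1, 1, 0, 0]
toy_positions = [k * 1_000_000 for k in range(12)]
for allowed in (0, 1, 3):
    print(allowed, find_runs(toy_genotypes, toy_positions, 2_000_000, allowed))
```

On the toy data, raising `max_het` merges short runs into longer ones, and the `has_adjacent_hets` flag marks merged runs that span neighbouring heterozygous calls and may deserve a closer look.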
format Online
Article
Text
id pubmed-4176748
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4176748 2014-09-28 Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors Ferenčaković, Maja Sölkner, Johann Curik, Ino Genet Sel Evol Research BioMed Central 2013-10-29 /pmc/articles/PMC4176748/ /pubmed/24168655 http://dx.doi.org/10.1186/1297-9686-45-42 Text en Copyright © 2013 Ferenčaković et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4176748/
https://www.ncbi.nlm.nih.gov/pubmed/24168655
http://dx.doi.org/10.1186/1297-9686-45-42