Construction of relatedness matrices using genotyping-by-sequencing data

BACKGROUND: Genotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals for a large number of single nucleotide polymorphisms (SNPs). Costs can be lowered by reducing the mean sequencing depth, but this results in genotype calls of lower qua...

Bibliographic Details
Main Authors: Dodds, Ken G., McEwan, John C., Brauning, Rudiger, Anderson, Rayna M., van Stijn, Tracey C., Kristjánsson, Theodor, Clarke, Shannon M.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4675043/
https://www.ncbi.nlm.nih.gov/pubmed/26654230
http://dx.doi.org/10.1186/s12864-015-2252-3
author Dodds, Ken G.
McEwan, John C.
Brauning, Rudiger
Anderson, Rayna M.
van Stijn, Tracey C.
Kristjánsson, Theodor
Clarke, Shannon M.
collection PubMed
description BACKGROUND: Genotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals at a large number of single nucleotide polymorphisms (SNPs). Costs can be lowered by reducing the mean sequencing depth, but this results in genotype calls of lower quality. A common analysis strategy is to filter SNPs to just those with sufficient depth, thereby greatly reducing the number of SNPs available. We investigate methods for estimating relatedness using GBS data, including low-depth results, using theoretical calculation, simulation and application to a real data set. RESULTS: We show that unbiased estimates of relatedness can be obtained by using only those SNPs with genotype calls in both individuals. The expected value of this estimator is independent of the SNP depth in each individual, under a model of genotype calling that includes the special case of the two alleles being read at random. In contrast, the estimator of self-relatedness does depend on the SNP depth, and we give a modification that yields unbiased estimates of self-relatedness. We refer to these methods of estimation as kinship using GBS with depth adjustment (KGD). The estimators can be calculated using matrix methods, which allow efficient computation. Simulation results were consistent with the methods being unbiased, and suggest that the optimal sequencing depth is around 2–4 for relatedness between individuals and 5–10 for self-relatedness. Application to a real data set revealed that some SNP filtering may still be necessary to exclude SNPs that did not behave in a Mendelian fashion. A simple graphical method (a ‘fin plot’) is given to illustrate this issue and to guide filtering parameters. CONCLUSION: We provide a method which gives unbiased estimates of relatedness, based on SNPs assayed by GBS, which accounts for the depth (including zero depth) of the genotype calls. This allows GBS to be applied at read depths which can be chosen to optimise the information obtained. SNPs with excess heterozygosity, often due to (partial) polyploidy or other duplications, can be filtered based on a simple graphical method. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2252-3) contains supplementary material, which is available to authorized users.
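The idea in the RESULTS paragraph of using only SNPs with genotype calls in both individuals can be illustrated with a minimal sketch. The abstract does not state the estimator's formula, so the VanRaden-style form below (products of centred genotypes divided by twice the summed p(1 - p), both restricted to jointly called SNPs) is an assumption, as are the function name `pairwise_relatedness`, the 0/1/2 genotype coding with `None` for zero depth, and the use of plain loops in place of the matrix methods the abstract mentions. Self-relatedness is deliberately left unset, because the abstract says it needs a depth adjustment that is not described here.

```python
from typing import List, Optional, Sequence

# Hypothetical coding: 0, 1 or 2 copies of the counted allele; None = no call (zero depth).
Genotype = Optional[int]


def pairwise_relatedness(
    genotypes: Sequence[Sequence[Genotype]],
    allele_freqs: Sequence[float],
) -> List[List[Optional[float]]]:
    """Relatedness between individuals from jointly called SNPs only.

    Assumed formula (not quoted from the abstract), for individuals i and j:
        sum_k (g_ik - 2 p_k)(g_jk - 2 p_k) / (2 * sum_k p_k (1 - p_k))
    where both sums run over SNPs k that are called in i AND in j.
    Diagonal entries stay None: self-relatedness needs a depth adjustment.
    """
    n = len(genotypes)
    result: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            numerator = 0.0
            denominator = 0.0
            for g_i, g_j, p in zip(genotypes[i], genotypes[j], allele_freqs):
                if g_i is None or g_j is None:
                    continue  # skip SNPs lacking a call in either individual
                numerator += (g_i - 2.0 * p) * (g_j - 2.0 * p)
                denominator += 2.0 * p * (1.0 - p)
            if denominator > 0.0:  # pairs with no informative shared SNP stay None
                result[i][j] = result[j][i] = numerator / denominator
    return result
```

Because the denominator is accumulated per pair, a SNP that is missing in one individual drops out of both the numerator and the scaling for that pair, and no minimum-depth filter is applied beforehand.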
format Online
Article
Text
id pubmed-4675043
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4675043 2015-12-11 Construction of relatedness matrices using genotyping-by-sequencing data Dodds, Ken G. McEwan, John C. Brauning, Rudiger Anderson, Rayna M. van Stijn, Tracey C. Kristjánsson, Theodor Clarke, Shannon M. BMC Genomics Methodology Article BioMed Central 2015-12-09 /pmc/articles/PMC4675043/ /pubmed/26654230 http://dx.doi.org/10.1186/s12864-015-2252-3 Text en © Dodds et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Construction of relatedness matrices using genotyping-by-sequencing data
topic Methodology Article