Parentage assignment with genotyping‐by‐sequencing data

In this paper, we evaluate using genotype‐by‐sequencing (GBS) data to perform parentage assignment in lieu of traditional array data. The use of GBS data raises two issues: First, for low‐coverage (e.g., <2×) GBS data, it may not be possible to call the genotype at many loci, a critical first ste...

Bibliographic Details
Main authors: Whalen, Andrew; Gorjanc, Gregor; Hickey, John M.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6392119/
https://www.ncbi.nlm.nih.gov/pubmed/30548685
http://dx.doi.org/10.1111/jbg.12370
_version_ 1783398415093202944
author Whalen, Andrew
Gorjanc, Gregor
Hickey, John M.
author_facet Whalen, Andrew
Gorjanc, Gregor
Hickey, John M.
author_sort Whalen, Andrew
collection PubMed
description In this paper, we evaluate using genotype‐by‐sequencing (GBS) data to perform parentage assignment in lieu of traditional array data. The use of GBS data raises two issues: First, for low‐coverage (e.g., <2×) GBS data, it may not be possible to call the genotype at many loci, a critical first step for detecting opposing homozygous markers. Second, the amount of sequencing coverage may vary across individuals, making it challenging to directly compare the likelihood scores between putative parents. To address these issues, we extend the probabilistic framework of Huisman (Molecular Ecology Resources, 2017, 17, 1009) and evaluate putative parents by comparing their (potentially noisy) genotypes to a series of proposal distributions. These distributions describe the expected genotype probabilities for the relatives of an individual. We assign putative parents as a parent if they are classified as a parent (as opposed to e.g., an unrelated individual), and if the assignment score passes a threshold. We evaluated this method on simulated data and found that (a) high‐coverage (>2×) GBS data performs similarly to array data and requires only a small number of markers to correctly assign parents and (b) low‐coverage GBS data (as low as 0.1×) can also be used, provided that it is obtained across a large number of markers. When analysing the low‐coverage GBS data, we also found a high number of false positives if the true parent is not contained within the list of candidate parents, but that this false positive rate can be greatly reduced by hand tuning the assignment threshold. We provide this parentage assignment method as a standalone program called AlphaAssign.
format Online
Article
Text
id pubmed-6392119
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-63921192019-03-07 Parentage assignment with genotyping‐by‐sequencing data Whalen, Andrew Gorjanc, Gregor Hickey, John M. J Anim Breed Genet Original Articles John Wiley and Sons Inc. 
2018-12-13 2019-03 /pmc/articles/PMC6392119/ /pubmed/30548685 http://dx.doi.org/10.1111/jbg.12370 Text en © 2018 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Articles
Whalen, Andrew
Gorjanc, Gregor
Hickey, John M.
Parentage assignment with genotyping‐by‐sequencing data
title Parentage assignment with genotyping‐by‐sequencing data
title_full Parentage assignment with genotyping‐by‐sequencing data
title_fullStr Parentage assignment with genotyping‐by‐sequencing data
title_full_unstemmed Parentage assignment with genotyping‐by‐sequencing data
title_short Parentage assignment with genotyping‐by‐sequencing data
title_sort parentage assignment with genotyping‐by‐sequencing data
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6392119/
https://www.ncbi.nlm.nih.gov/pubmed/30548685
http://dx.doi.org/10.1111/jbg.12370
work_keys_str_mv AT whalenandrew parentageassignmentwithgenotypingbysequencingdata
AT gorjancgregor parentageassignmentwithgenotypingbysequencingdata
AT hickeyjohnm parentageassignmentwithgenotypingbysequencingdata
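
Illustrative sketch (not part of the catalogued record): the description above explains that low-coverage GBS reads cannot be reliably called into genotypes, so candidate parents are scored probabilistically instead of by counting opposing homozygous markers. The Python below is a minimal sketch of that general idea and is not the AlphaAssign implementation. It compares only two hypotheses (parent versus unrelated) and assumes independent biallelic loci, known alternative-allele frequencies strictly between 0 and 1, Hardy-Weinberg proportions, a fixed sequencing error rate, and an unknown second parent drawn from the population. All function and parameter names are hypothetical.

```python
import math


def genotype_likelihoods(ref_reads, alt_reads, error=0.01):
    """P(reads | genotype) for genotypes carrying 0, 1 or 2 alternative alleles.

    The binomial coefficient is omitted because it cancels in likelihood ratios.
    """
    alt_read_prob = (error, 0.5, 1.0 - error)
    return [q ** alt_reads * (1.0 - q) ** ref_reads for q in alt_read_prob]


def hwe_prior(p):
    """Genotype frequencies under Hardy-Weinberg for alternative-allele frequency p."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def transmission(parent_genotype, p):
    """P(offspring genotype | one parent's genotype), other parent unknown.

    The known parent passes on the alternative allele with probability t;
    the second allele is drawn from the population with frequency p.
    """
    t = parent_genotype / 2.0
    return [(1 - t) * (1 - p), t * (1 - p) + (1 - t) * p, t * p]


def parent_vs_unrelated_score(child_reads, candidate_reads, alt_freqs, error=0.01):
    """Sum over loci of log[ L(candidate is a parent) / L(candidate is unrelated) ].

    child_reads and candidate_reads are lists of (ref_count, alt_count) pairs,
    one pair per locus; alt_freqs holds the matching allele frequencies.
    """
    score = 0.0
    for (c_ref, c_alt), (p_ref, p_alt), p in zip(child_reads, candidate_reads, alt_freqs):
        child_lik = genotype_likelihoods(c_ref, c_alt, error)
        cand_lik = genotype_likelihoods(p_ref, p_alt, error)
        prior = hwe_prior(p)

        # Parent hypothesis: marginalise over both unknown true genotypes,
        # linking them through Mendelian transmission.
        lik_parent = 0.0
        for gp in range(3):
            trans = transmission(gp, p)
            child_term = sum(trans[gc] * child_lik[gc] for gc in range(3))
            lik_parent += prior[gp] * cand_lik[gp] * child_term

        # Unrelated hypothesis: the two genotypes are independent draws.
        lik_candidate = sum(prior[g] * cand_lik[g] for g in range(3))
        lik_child = sum(prior[g] * child_lik[g] for g in range(3))

        score += math.log(lik_parent) - math.log(lik_candidate * lik_child)
    return score


if __name__ == "__main__":
    # Three loci at roughly 1x coverage; a locus with no reads contributes nothing.
    child = [(1, 0), (0, 1), (0, 0)]
    candidate = [(2, 0), (0, 1), (1, 1)]
    print(parent_vs_unrelated_score(child, candidate, [0.3, 0.2, 0.5]))
```

Because the score sums over the unknown true genotypes, a locus with zero reads adds exactly nothing, and a locus with one or two reads adds only weak evidence. This is consistent with the description's point that very low coverage needs many markers, and that a threshold on the score may need tuning when coverage differs between candidates.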