Parentage assignment with genotyping‐by‐sequencing data
In this paper, we evaluate using genotype‐by‐sequencing (GBS) data to perform parentage assignment in lieu of traditional array data. The use of GBS data raises two issues: First, for low‐coverage (e.g., <2×) GBS data, it may not be possible to call the genotype at many loci, a critical first ste...
Main authors: | Whalen, Andrew; Gorjanc, Gregor; Hickey, John M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2018 |
Subjects: | Original Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6392119/ https://www.ncbi.nlm.nih.gov/pubmed/30548685 http://dx.doi.org/10.1111/jbg.12370 |
_version_ | 1783398415093202944 |
---|---|
author | Whalen, Andrew Gorjanc, Gregor Hickey, John M. |
author_facet | Whalen, Andrew Gorjanc, Gregor Hickey, John M. |
author_sort | Whalen, Andrew |
collection | PubMed |
description | In this paper, we evaluate using genotype‐by‐sequencing (GBS) data to perform parentage assignment in lieu of traditional array data. The use of GBS data raises two issues: First, for low‐coverage (e.g., <2×) GBS data, it may not be possible to call the genotype at many loci, a critical first step for detecting opposing homozygous markers. Second, the amount of sequencing coverage may vary across individuals, making it challenging to directly compare the likelihood scores between putative parents. To address these issues, we extend the probabilistic framework of Huisman (Molecular Ecology Resources, 2017, 17, 1009) and evaluate putative parents by comparing their (potentially noisy) genotypes to a series of proposal distributions. These distributions describe the expected genotype probabilities for the relatives of an individual. We assign putative parents as a parent if they are classified as a parent (as opposed to e.g., an unrelated individual), and if the assignment score passes a threshold. We evaluated this method on simulated data and found that (a) high‐coverage (>2×) GBS data performs similarly to array data and requires only a small number of markers to correctly assign parents and (b) low‐coverage GBS data (as low as 0.1×) can also be used, provided that it is obtained across a large number of markers. When analysing the low‐coverage GBS data, we also found a high number of false positives if the true parent is not contained within the list of candidate parents, but that this false positive rate can be greatly reduced by hand tuning the assignment threshold. We provide this parentage assignment method as a standalone program called AlphaAssign. |
format | Online Article Text |
id | pubmed-6392119 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6392119 2019-03-07 Parentage assignment with genotyping‐by‐sequencing data Whalen, Andrew Gorjanc, Gregor Hickey, John M. J Anim Breed Genet Original Articles John Wiley and Sons Inc. 2018-12-13 2019-03 /pmc/articles/PMC6392119/ /pubmed/30548685 http://dx.doi.org/10.1111/jbg.12370 Text en © 2018 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Articles Whalen, Andrew Gorjanc, Gregor Hickey, John M. Parentage assignment with genotyping‐by‐sequencing data |
title | Parentage assignment with genotyping‐by‐sequencing data |
topic | Original Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6392119/ https://www.ncbi.nlm.nih.gov/pubmed/30548685 http://dx.doi.org/10.1111/jbg.12370 |
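
The description in this record outlines the idea of scoring a putative parent by comparing its (possibly noisy) genotype to proposal distributions for a parent versus an unrelated individual. The following is a minimal sketch of that idea, not the AlphaAssign implementation. All function names are hypothetical, and the sketch assumes biallelic loci, a fixed 1% sequencing error rate, known alternative-allele frequencies, Hardy-Weinberg proportions, and an unknown second parent drawn at random from the population.

```python
import math


def genotype_likelihoods(ref_reads, alt_reads, error=0.01):
    """P(reads | g) for g = 0, 1, 2 alternative alleles.

    The binomial coefficient is omitted because it cancels in every ratio.
    The 1% error rate is an assumption of this sketch.
    """
    liks = []
    for g in (0, 1, 2):
        # Probability that a single read shows the alternative allele.
        p_alt = (g / 2) * (1 - error) + (1 - g / 2) * error
        liks.append((1 - p_alt) ** ref_reads * p_alt ** alt_reads)
    return liks


def hwe_prior(p):
    """Genotype probabilities for an unrelated individual (Hardy-Weinberg)."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def transmission(g_parent, p):
    """P(offspring genotype | one parent's genotype), other parent random."""
    t = g_parent / 2  # chance this parent passes on the alternative allele
    return [(1 - t) * (1 - p), t * (1 - p) + (1 - t) * p, t * p]


def parent_proposal(offspring_liks, p):
    """Expected genotype probabilities for a parent of the offspring."""
    prior = hwe_prior(p)
    weighted = [lik * q for lik, q in zip(offspring_liks, prior)]
    total = sum(weighted)
    posterior = [w / total for w in weighted]  # P(g_off | offspring reads)
    proposal = []
    for g_par in (0, 1, 2):
        trans = transmission(g_par, p)
        # Bayes: P(g_par | g_off) = P(g_par) * P(g_off | g_par) / P(g_off)
        proposal.append(sum(
            posterior[g_off] * prior[g_par] * trans[g_off] / prior[g_off]
            for g_off in (0, 1, 2)
        ))
    return proposal


def assignment_score(offspring_reads, candidate_reads, alt_freqs):
    """Sum over loci of log P(reads | parent) - log P(reads | unrelated)."""
    score = 0.0
    for (o_ref, o_alt), (c_ref, c_alt), p in zip(
            offspring_reads, candidate_reads, alt_freqs):
        cand = genotype_likelihoods(c_ref, c_alt)
        proposal = parent_proposal(genotype_likelihoods(o_ref, o_alt), p)
        prior = hwe_prior(p)
        lik_parent = sum(c * q for c, q in zip(cand, proposal))
        lik_unrelated = sum(c * q for c, q in zip(cand, prior))
        score += math.log(lik_parent) - math.log(lik_unrelated)
    return score


# Toy data: (reference reads, alternative reads) at five loci.
offspring = [(0, 10)] * 5
freqs = [0.5] * 5
consistent = [(0, 10)] * 5   # same homozygote as the offspring
opposing = [(10, 0)] * 5     # opposing homozygote
unsequenced = [(0, 0)] * 5   # no reads at any locus

print(assignment_score(offspring, consistent, freqs))
print(assignment_score(offspring, opposing, freqs))
print(assignment_score(offspring, unsequenced, freqs))
```

In this toy example the consistent candidate receives a positive score, the opposing-homozygote candidate a negative one, and a candidate with no reads contributes nothing, since every genotype is equally likely under its data. Working with genotype likelihoods rather than called genotypes is what lets loci with very few reads still contribute partial evidence; the threshold on the score mentioned in the description would be applied on top of this and is not modelled here.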