Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond

Data on hundreds or thousands of single nucleotide polymorphisms (SNPs) provide detailed information about the relationships between individuals, but currently few tools can turn this information into a multigenerational pedigree. I present the R package sequoia, which assigns parents, clusters half...

Bibliographic Details
Main author: Huisman, Jisca
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2017
Subjects: RESOURCE ARTICLES
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6849609/
https://www.ncbi.nlm.nih.gov/pubmed/28271620
http://dx.doi.org/10.1111/1755-0998.12665
_version_ 1783469243378958336
author Huisman, Jisca
collection PubMed
description Data on hundreds or thousands of single nucleotide polymorphisms (SNPs) provide detailed information about the relationships between individuals, but currently few tools can turn this information into a multigenerational pedigree. I present the R package sequoia, which assigns parents, clusters half‐siblings sharing an unsampled parent and assigns grandparents to half‐sibships. Assignments are made after consideration of the likelihoods of all possible first‐, second‐ and third‐degree relationships between the focal individuals, as well as the traditional alternative of being unrelated. This careful exploration of the local likelihood surface is implemented in a fast, heuristic hill‐climbing algorithm. Distinction between the various categories of second‐degree relatives is possible when likelihoods are calculated conditional on at least one parent of each focal individual. Performance was tested on simulated data sets with realistic genotyping error rate and missingness, based on three different large pedigrees (N = 1000–2000). This included a complex pedigree with overlapping generations, occasional close inbreeding and some unknown birth years. Parentage assignment was highly accurate down to about 100 independent SNPs (error rate <0.1%) and fast (<1 min) as most pairs can be excluded from being parent–offspring based on opposite homozygosity. For full pedigree reconstruction, 40% of parents were assumed nongenotyped. Reconstruction resulted in low error rates (<0.3%), high assignment rates (>99%) in limited computation time (typically <1 h) when at least 200 independent SNPs were used. In three empirical data sets, relatedness estimated from the inferred pedigree was strongly correlated to genomic relatedness.
format Online
Article
Text
id pubmed-6849609
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6849609 2019-11-15 Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond Huisman, Jisca Mol Ecol Resour RESOURCE ARTICLES John Wiley and Sons Inc. 2017-04-06 2017-09 /pmc/articles/PMC6849609/ /pubmed/28271620 http://dx.doi.org/10.1111/1755-0998.12665 Text en © 2017 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond
topic RESOURCE ARTICLES
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6849609/
https://www.ncbi.nlm.nih.gov/pubmed/28271620
http://dx.doi.org/10.1111/1755-0998.12665