‘apparent’: a simple and flexible R package for accurate SNP-based parentage analysis in the absence of guiding information

BACKGROUND: The accurate determination of parent-progeny relationships within both in situ natural populations and ex situ genetic resource collections can greatly enhance plant breeding/domestication efforts and support plant genetic resource conservation strategies. Although a range of parentage a...

Bibliographic Details
Main authors: Melo, Arthur T. O., Hale, Iago
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6396488/
https://www.ncbi.nlm.nih.gov/pubmed/30819089
http://dx.doi.org/10.1186/s12859-019-2662-3
_version_ 1783399262835441664
author Melo, Arthur T. O.
Hale, Iago
author_facet Melo, Arthur T. O.
Hale, Iago
author_sort Melo, Arthur T. O.
collection PubMed
description BACKGROUND: The accurate determination of parent-progeny relationships within both in situ natural populations and ex situ genetic resource collections can greatly enhance plant breeding/domestication efforts and support plant genetic resource conservation strategies. Although a range of parentage analysis tools are available, none are designed to infer such relationships using genome-wide single nucleotide polymorphism (SNP) data in the complete absence of guiding information, such as generational groups, partial pedigrees, or genders. The R package (‘apparent’) developed and presented here addresses this gap.
RESULTS: ‘apparent’ adopts a novel strategy of parentage analysis based on a test of genetic identity between a theoretically expected progeny (EP(ij)), whose genotypic state can be inferred at all homozygous loci for a pair of putative parents (i and j), and all potential offspring (PO(k)), represented by the k individuals of a given germplasm collection. Using the Gower Dissimilarity metric (GD), genetic identity between EP(ij) and PO(k) is taken as evidence that individuals i and j are the true parents of offspring k. Significance of a given triad (parental pair(ij) + offspring(k)) is evaluated relative to the distribution of all GD(ij|k) values for the population. With no guiding information provided, ‘apparent’ correctly identified the parental pairs of 15 lines of known pedigree within a test population of 77 accessions of Actinidia arguta, a performance unmatched by five other commonly used parentage analysis tools. In the case of an inconclusive triad analysis due to the absence of one parent from the test population, ‘apparent’ can perform a subsequent dyad analysis to identify a likely single parent for a given offspring. Average dyad analysis accuracy was 73.3% in the complete absence of pedigree information but increased to 100% when minimal generational information (adults vs. progeny) was provided.
CONCLUSIONS: The ‘apparent’ R package is a fast and accurate parentage analysis tool that uses genome-wide SNP data to identify parent-progeny relationships within populations for which no a priori knowledge of family structure exists.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2662-3) contains supplementary material, which is available to authorized users.
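The triad idea in the abstract (build an expected progeny from the homozygous loci of two putative parents, then compare it with every potential offspring using Gower dissimilarity) can be illustrated with a small Python sketch. This is not the package's code, which is written in R. The genotype coding (0.0 and 1.0 for the two homozygous states, 0.5 for heterozygous, None for missing), the function names, and the toy accessions A to D are all assumptions made for illustration. The sketch only ranks triads by GD; the significance test mentioned in the abstract is not reproduced here.

```python
from itertools import combinations

# Assumed coding (not from the source): 0.0 and 1.0 are the two homozygous
# states, 0.5 is heterozygous, None is a missing call.
HOMOZYGOUS = (0.0, 1.0)


def expected_progeny(parent_i, parent_j):
    """Expected progeny genotype at loci where both parents are homozygous.

    Loci where either parent is heterozygous or missing are set to None,
    because the progeny state cannot be inferred there.
    """
    expected = []
    for gi, gj in zip(parent_i, parent_j):
        if gi in HOMOZYGOUS and gj in HOMOZYGOUS:
            expected.append((gi + gj) / 2)  # 0.0, 0.5 or 1.0
        else:
            expected.append(None)
    return expected


def gower_dissimilarity(expected, offspring):
    """Mean absolute difference over loci scored in both vectors.

    With genotypes scaled to [0, 1] the per-locus range is 1, so this is
    the Gower dissimilarity for numeric variables.
    """
    diffs = [abs(e - o) for e, o in zip(expected, offspring)
             if e is not None and o is not None]
    if not diffs:
        return None  # no informative loci for this triad
    return sum(diffs) / len(diffs)


def rank_triads(genotypes):
    """Return (GD, parent_i, parent_j, offspring_k) tuples, lowest GD first."""
    results = []
    for i, j in combinations(sorted(genotypes), 2):
        ep = expected_progeny(genotypes[i], genotypes[j])
        for k in genotypes:
            if k in (i, j):
                continue
            gd = gower_dissimilarity(ep, genotypes[k])
            if gd is not None:
                results.append((gd, i, j, k))
    return sorted(results)


# Hypothetical toy data: C is consistent with a cross of A and B, D is unrelated.
toy = {
    "A": [0.0, 1.0, 0.0, 1.0, 0.5, 1.0, 0.0, 0.5],
    "B": [1.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.5, 1.0],
    "C": [0.5, 1.0, 0.0, 0.5, 0.0, 1.0, 0.0, 1.0],
    "D": [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0],
}

ranked = rank_triads(toy)
print(ranked[0])  # (0.0, 'A', 'B', 'C')
```

In this toy set the triad with parents A and B and offspring C has a GD of 0, while the same expected progeny compared against D gives 0.75. With only eight loci the separation is purely illustrative; real analyses rely on genome-wide SNP data.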
format Online
Article
Text
id pubmed-6396488
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6396488 2019-03-13 ‘apparent’: a simple and flexible R package for accurate SNP-based parentage analysis in the absence of guiding information Melo, Arthur T. O. Hale, Iago BMC Bioinformatics Software BioMed Central 2019-02-28 /pmc/articles/PMC6396488/ /pubmed/30819089 http://dx.doi.org/10.1186/s12859-019-2662-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ‘apparent’: a simple and flexible R package for accurate SNP-based parentage analysis in the absence of guiding information
title_sort ‘apparent’: a simple and flexible r package for accurate snp-based parentage analysis in the absence of guiding information
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6396488/
https://www.ncbi.nlm.nih.gov/pubmed/30819089
http://dx.doi.org/10.1186/s12859-019-2662-3