
Contributions of linkage disequilibrium and co-segregation information to the accuracy of genomic prediction

BACKGROUND: Traditional genomic prediction models using multiple regression on single nucleotide polymorphisms (SNPs) genotypes exploit associations between genotypes of quantitative trait loci (QTL) and SNPs, which can be created by historical linkage disequilibrium (LD), recent co-segregation (CS)...


Bibliographic Details
Main Authors: Sun, Xiaochen, Fernando, Rohan, Dekkers, Jack
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5060012/
https://www.ncbi.nlm.nih.gov/pubmed/27729012
http://dx.doi.org/10.1186/s12711-016-0255-4
author Sun, Xiaochen
Fernando, Rohan
Dekkers, Jack
collection PubMed
description BACKGROUND: Traditional genomic prediction models that use multiple regression on single nucleotide polymorphism (SNP) genotypes exploit associations between genotypes of quantitative trait loci (QTL) and SNPs, which can be created by historical linkage disequilibrium (LD), recent co-segregation (CS) and pedigree relationships. Results from field data analyses show that prediction accuracy is usually much higher for individuals that are close relatives of the training population than for distantly related individuals. A possible reason is that historical LD between QTL and SNPs is weak and that, for close relatives, the prediction accuracy of SNP models comes mainly from pedigree relationships and CS. Information from pedigree relationships decreases fast over generations and only contributes to within-family prediction. Information from CS is affected by family structure and effective population size, and can contribute substantially to prediction accuracy when modeled explicitly.
RESULTS: In this study, a method to explicitly model CS was developed by following the transmission of putative QTL alleles using allele origins at SNPs. Bayesian hierarchical models that combine information from LD and CS (LD-CS model) were developed for genomic prediction in pedigree populations. Contributions of LD and CS information to prediction accuracy across families and generations without retraining were investigated in simulated half-sib datasets and deep pedigrees with different recent effective population sizes, respectively. Results from half-sib datasets showed that when historical LD between QTL and SNPs was low, accuracy of the LD model decreased when the training data size was increased by adding independent sire families, whereas accuracies of the CS and LD-CS models increased and plateaued rapidly. Results from deep pedigree datasets showed that the LD model had high accuracy across generations only when historical LD between QTL and SNPs was high. Modeling CS explicitly resulted in higher accuracy than the LD model across generations when the mating design generated many close relatives.
CONCLUSIONS: Our results suggest that modeling CS explicitly improves accuracy of genomic prediction when historical LD between QTL and SNPs is low. Modeling both LD and CS explicitly is expected to improve accuracy when recent effective population size is small, or when the training data include many independent families.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12711-016-0255-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5060012
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5060012 2016-10-17 Sun, Xiaochen Fernando, Rohan Dekkers, Jack Genet Sel Evol Research Article BioMed Central 2016-10-11 /pmc/articles/PMC5060012/ /pubmed/27729012 http://dx.doi.org/10.1186/s12711-016-0255-4 Text en
© The Author(s) 2016. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article