Weighted single-step genomic best linear unbiased prediction integrating variants selected from sequencing data by association and bioinformatics analyses
BACKGROUND: Sequencing data enable the detection of causal loci or single nucleotide polymorphisms (SNPs) highly linked to causal loci to improve genomic prediction. However, until now, studies on integrating such SNPs using a single-step genomic best linear unbiased prediction (ssGBLUP) model are s...
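Main Authors: | Liu, Aoxing; Lund, Mogens Sandø; Boichard, Didier; Karaman, Emre; Guldbrandtsen, Bernt; Fritz, Sebastien; Aamand, Gert Pedersen; Nielsen, Ulrik Sander; Sahana, Goutam; Wang, Yachun; Su, Guosheng |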
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429790/ https://www.ncbi.nlm.nih.gov/pubmed/32799816 http://dx.doi.org/10.1186/s12711-020-00568-0 |
_version_ | 1783571318406381568 |
---|---|
author | Liu, Aoxing Lund, Mogens Sandø Boichard, Didier Karaman, Emre Guldbrandtsen, Bernt Fritz, Sebastien Aamand, Gert Pedersen Nielsen, Ulrik Sander Sahana, Goutam Wang, Yachun Su, Guosheng |
author_sort | Liu, Aoxing |
collection | PubMed |
description | BACKGROUND: Sequencing data enable the detection of causal loci or single nucleotide polymorphisms (SNPs) highly linked to causal loci to improve genomic prediction. However, until now, studies on integrating such SNPs using a single-step genomic best linear unbiased prediction (ssGBLUP) model are scarce. We investigated the integration of sequencing SNPs selected by association (1262 SNPs) and bioinformatics (2359 SNPs) analyses into the currently used 54K-SNP chip, using three ssGBLUP models which make different assumptions on the distribution of SNP effects: a basic ssGBLUP model, a so-called featured ssGBLUP (ssFGBLUP) model that considered selected sequencing SNPs as a feature genetic component, and a weighted ssGBLUP (ssWGBLUP) model in which the genomic relationship matrix was weighted by the SNP variances estimated from a Bayesian whole-genome regression model, with every 1, 30, or 100 adjacent SNPs within a chromosome region sharing the same variance. We used data on milk production and female fertility in Danish Jersey. In total, 15,823 genotyped and 528,981 non-genotyped females born between 1990 and 2013 were used as reference population and 7415 genotyped females and 33,040 non-genotyped females born between 2014 and 2016 were used as validation population. RESULTS: With basic ssGBLUP, integrating SNPs selected from sequencing data improved prediction reliabilities for milk and protein yields, but resulted in limited or no improvement for fat yield and female fertility. Model performances depended on the SNP set used. When using ssWGBLUP with the 54K SNPs, reliabilities for milk and protein yields improved by 0.028 for genotyped animals and by 0.006 for non-genotyped animals compared with ssGBLUP. However, with the SNP set that included SNPs selected from sequencing data, no statistically significant difference in prediction reliability was observed between the three ssGBLUP models. CONCLUSIONS: In summary, when using 54K SNPs, a ssWGBLUP model with a common weight on the SNPs in a given region is a feasible approach for single-trait genetic evaluation. Integrating relevant SNPs selected from sequencing data into the standard SNP chip can improve the reliability of genomic prediction. Based on such SNP data, a basic ssGBLUP model was suggested since no significant improvement was observed from using alternative models such as ssWGBLUP and ssFGBLUP. |
format | Online Article Text |
id | pubmed-7429790 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7429790 2020-08-18 Genet Sel Evol Research Article BioMed Central 2020-08-14 /pmc/articles/PMC7429790/ /pubmed/32799816 http://dx.doi.org/10.1186/s12711-020-00568-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Weighted single-step genomic best linear unbiased prediction integrating variants selected from sequencing data by association and bioinformatics analyses |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429790/ https://www.ncbi.nlm.nih.gov/pubmed/32799816 http://dx.doi.org/10.1186/s12711-020-00568-0 |