
Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)

BACKGROUND: Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact h...
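The coding dependence mentioned in the abstract can be illustrated with a small numerical sketch. The following is not the authors' implementation; it is a minimal NumPy example under assumed toy data. The genotype matrix, the helper name `pairwise_products`, and the choice to use only pairs of distinct markers are all hypothetical. It builds the products of marker values that serve as interaction predictors and compares a {0,1,2} coding with the symmetric {−1,0,1} coding.

```python
import itertools
import numpy as np

def pairwise_products(M):
    """Return the products M[:, j] * M[:, k] for all marker pairs j < k.

    Hypothetical helper: these columns play the role of the interaction
    predictors in an epistasis marker effect model.
    """
    pairs = itertools.combinations(range(M.shape[1]), 2)
    return np.column_stack([M[:, j] * M[:, k] for j, k in pairs])

# Assumed toy genotypes: 4 individuals x 3 markers, coded as allele counts.
M_012 = np.array([[0, 0, 0],
                  [2, 2, 2],
                  [0, 2, 1],
                  [1, 1, 1]], dtype=float)

# Symmetric coding {-1, 0, 1}: the same genotypes shifted by one.
M_sym = M_012 - 1.0

Z_012 = pairwise_products(M_012)
Z_sym = pairwise_products(M_sym)

# Similarity of individuals implied by the interaction predictors alone.
K_012 = Z_012 @ Z_012.T
K_sym = Z_sym @ Z_sym.T

print("Interaction predictors, {0,1,2} coding:\n", Z_012)
print("Interaction predictors, {-1,0,1} coding:\n", Z_sym)
print("Kernels identical:", np.allclose(K_012, K_sym))
```

In this toy example the first individual has all interaction predictors equal to zero under the {0,1,2} coding, whereas under the symmetric coding the two opposite homozygotes (first and second rows) receive identical interaction predictors. The same genotypes therefore imply different assumptions about which allele combinations contribute to the phenotype, depending only on the coding.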


Bibliographic Details
Main authors: Martini, Johannes W. R., Gao, Ning, Cardoso, Diercles F., Wimmer, Valentin, Erbe, Malena, Cantet, Rodolfo J. C., Simianer, Henner
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209948/
https://www.ncbi.nlm.nih.gov/pubmed/28049412
http://dx.doi.org/10.1186/s12859-016-1439-1
author Martini, Johannes W. R.
Gao, Ning
Cardoso, Diercles F.
Wimmer, Valentin
Erbe, Malena
Cantet, Rodolfo J. C.
Simianer, Henner
author_facet Martini, Johannes W. R.
Gao, Ning
Cardoso, Diercles F.
Wimmer, Valentin
Erbe, Malena
Cantet, Rodolfo J. C.
Simianer, Henner
author_sort Martini, Johannes W. R.
collection PubMed
description BACKGROUND: Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in literature, the nature of the problem has not been thoroughly investigated so far. RESULTS: We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits. CONCLUSION: Based on our results, for EGBLUP, a symmetric coding {−1,1} or {−1,0,1} should be preferred, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can for instance be incorporated into the marker coding of EGBLUP. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1439-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5209948
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-52099482017-01-04 Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE) Martini, Johannes W. R. Gao, Ning Cardoso, Diercles F. Wimmer, Valentin Erbe, Malena Cantet, Rodolfo J. C. Simianer, Henner BMC Bioinformatics Methodology Article BACKGROUND: Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in literature, the nature of the problem has not been thoroughly investigated so far. RESULTS: We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits. CONCLUSION: Based on our results, for EGBLUP, a symmetric coding {−1,1} or {−1,0,1} should be preferred, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can for instance be incorporated into the marker coding of EGBLUP. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1439-1) contains supplementary material, which is available to authorized users. BioMed Central 2017-01-03 /pmc/articles/PMC5209948/ /pubmed/28049412 http://dx.doi.org/10.1186/s12859-016-1439-1 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Martini, Johannes W. R.
Gao, Ning
Cardoso, Diercles F.
Wimmer, Valentin
Erbe, Malena
Cantet, Rodolfo J. C.
Simianer, Henner
Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title_full Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title_fullStr Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title_full_unstemmed Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title_short Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE)
title_sort genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended gblup and properties of the categorical epistasis model (ce)
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209948/
https://www.ncbi.nlm.nih.gov/pubmed/28049412
http://dx.doi.org/10.1186/s12859-016-1439-1