Accounting for overlapping annotations in genomic prediction models of complex traits


Bibliographic Details
Main Authors: Mollandin, Fanny, Gilbert, Hélène, Croiseau, Pascal, Rau, Andrea
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9446854/
https://www.ncbi.nlm.nih.gov/pubmed/36068513
http://dx.doi.org/10.1186/s12859-022-04914-5
collection PubMed
description BACKGROUND: It is now widespread in livestock and plant breeding to use genotyping data to predict phenotypes with genomic prediction models. In parallel, genomic annotations related to a variety of traits are increasing in number and granularity, providing valuable insight into potentially important positions in the genome. The BayesRC model integrates this prior biological information by factorizing the genome according to disjoint annotation categories, in some cases enabling improved prediction of heritable traits. However, BayesRC is not adapted to cases where markers may have multiple annotations. RESULTS: We propose two novel Bayesian approaches to account for multi-annotated markers through a cumulative (BayesRC+) or preferential (BayesRCπ) model of the contribution of multiple annotation categories. We illustrate their performance on simulated data with various genetic architectures and types of annotations. We also explore their use on data from a backcross population of growing pigs in conjunction with annotations constructed using the PigQTLdb. In both simulated and real data, we observed a modest improvement in prediction quality with our models when used with informative annotations. In addition, our results show that BayesRC+ successfully prioritizes multi-annotated markers according to their posterior variance, while BayesRCπ provides a useful interpretation of informative annotations for multi-annotated markers. Finally, we explore several strategies for constructing annotations from a public database, highlighting the importance of careful consideration of this step. CONCLUSION: When used with annotations that are relevant to the trait under study, BayesRCπ and BayesRC+ allow for improved prediction and prioritization of multi-annotated markers, and can provide useful biological insight into the genetic architecture of traits.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04914-5.
id pubmed-9446854
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9446854 2022-09-07 Accounting for overlapping annotations in genomic prediction models of complex traits Mollandin, Fanny Gilbert, Hélène Croiseau, Pascal Rau, Andrea BMC Bioinformatics Research BioMed Central 2022-09-06 /pmc/articles/PMC9446854/ /pubmed/36068513 http://dx.doi.org/10.1186/s12859-022-04914-5 Text en
© The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
topic Research