L(2,1)-norm regularized multivariate regression model with applications to genomic prediction

Bibliographic Details
Main authors: Mbebi, Alain J; Tong, Hao; Nikoloski, Zoran
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479665/
https://www.ncbi.nlm.nih.gov/pubmed/33774677
http://dx.doi.org/10.1093/bioinformatics/btab212
_version_ 1784576307948945408
author Mbebi, Alain J
Tong, Hao
Nikoloski, Zoran
author_facet Mbebi, Alain J
Tong, Hao
Nikoloski, Zoran
author_sort Mbebi, Alain J
collection PubMed
description MOTIVATION: Genomic selection (GS) is currently deemed the most effective approach to speed up breeding of agricultural varieties. It has been recognized that considering multiple traits in GS can improve prediction accuracy for traits of low heritability. However, since GS forgoes statistical testing with the idea of improving predictions, it does not facilitate mechanistic understanding of the contribution of particular single nucleotide polymorphisms (SNPs). RESULTS: Here, we propose an L(2,1)-norm regularized multivariate regression model and devise a fast and efficient iterative optimization algorithm, called L(2,1)-joint, applicable in multi-trait GS. The use of the L(2,1)-norm facilitates variable selection in a penalized multivariate regression that considers the relation between individuals, when the number of SNPs is much larger than the number of individuals. The capacity for variable selection allows us to define master regulators that can be used in a multi-trait GS setting to dissect the genetic architecture of the analyzed traits. Our comparative analyses demonstrate that the proposed model is a favorable candidate compared to existing state-of-the-art approaches. Prediction and variable selection with datasets from Brassica napus, wheat and Arabidopsis thaliana diversity panels are conducted to further showcase the performance of the proposed model. AVAILABILITY AND IMPLEMENTATION: The model is implemented in the R programming language and the code is freely available from https://github.com/alainmbebi/L21-norm-GS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
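The description above says that an L(2,1)-norm penalty enables variable selection when SNPs far outnumber individuals. The following is a minimal illustrative sketch in Python of that general idea, not the authors' L(2,1)-joint algorithm (which is implemented in R and also models the relation between individuals). It assumes the plain objective ||Y - XB||_F^2 + lam * ||B||_{2,1} and solves it with a generic iteratively reweighted least-squares scheme. The names `l21_norm`, `l21_regression`, `lam`, `n_iter` and `eps` are hypothetical, and the data are simulated.

```python
import numpy as np


def l21_norm(B):
    """L(2,1) norm: sum over rows (SNPs) of the Euclidean norm across traits."""
    return float(np.sum(np.linalg.norm(B, axis=1)))


def l21_regression(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Sketch: minimise ||Y - X B||_F^2 + lam * ||B||_{2,1} by reweighting.

    X is (n individuals x p SNPs), Y is (n individuals x q traits).
    Each pass solves (X^T X + lam * D) B = X^T Y with
    D = diag(1 / (2 * ||b_j||)), written in an n x n form that stays
    cheap when p is much larger than n.
    """
    n, _ = X.shape
    # Minimum-norm least-squares fit as a starting point (assumption).
    B = np.linalg.lstsq(X, Y, rcond=None)[0]
    for _ in range(n_iter):
        # eps keeps the weights finite when a row shrinks towards zero.
        row_norms = np.sqrt(np.sum(B ** 2, axis=1) + eps)
        d_inv = 2.0 * row_norms            # diagonal of D^{-1}
        XD = X * d_inv                     # X @ D^{-1}, scales columns
        K = XD @ X.T + lam * np.eye(n)     # n x n system matrix
        B = XD.T @ np.linalg.solve(K, Y)   # D^{-1} X^T K^{-1} Y
    return B


# Simulated example: 40 individuals, 60 SNPs, 2 traits, 3 causal SNPs.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 60))
B_true = np.zeros((60, 2))
B_true[:3] = 3.0
Y = X @ B_true + 0.1 * rng.standard_normal((40, 2))

B_hat = l21_regression(X, Y, lam=5.0)
# Rank SNPs by the norm of their coefficient row across both traits.
ranking = np.argsort(np.linalg.norm(B_hat, axis=1))[::-1]
print("top-ranked SNP indices:", ranking[:3])
```

Because the penalty acts on whole rows of the coefficient matrix, a SNP is either retained for all traits or shrunk towards zero for all of them, which is what makes a joint ranking of SNPs across traits possible in this sketch.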
format Online
Article
Text
id pubmed-8479665
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8479665 2021-09-30 L(2,1)-norm regularized multivariate regression model with applications to genomic prediction Mbebi, Alain J; Tong, Hao; Nikoloski, Zoran Bioinformatics Original Papers MOTIVATION: Genomic selection (GS) is currently deemed the most effective approach to speed up breeding of agricultural varieties. It has been recognized that considering multiple traits in GS can improve prediction accuracy for traits of low heritability. However, since GS forgoes statistical testing with the idea of improving predictions, it does not facilitate mechanistic understanding of the contribution of particular single nucleotide polymorphisms (SNPs). RESULTS: Here, we propose an L(2,1)-norm regularized multivariate regression model and devise a fast and efficient iterative optimization algorithm, called L(2,1)-joint, applicable in multi-trait GS. The use of the L(2,1)-norm facilitates variable selection in a penalized multivariate regression that considers the relation between individuals, when the number of SNPs is much larger than the number of individuals. The capacity for variable selection allows us to define master regulators that can be used in a multi-trait GS setting to dissect the genetic architecture of the analyzed traits. Our comparative analyses demonstrate that the proposed model is a favorable candidate compared to existing state-of-the-art approaches. Prediction and variable selection with datasets from Brassica napus, wheat and Arabidopsis thaliana diversity panels are conducted to further showcase the performance of the proposed model. AVAILABILITY AND IMPLEMENTATION: The model is implemented in the R programming language and the code is freely available from https://github.com/alainmbebi/L21-norm-GS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Oxford University Press 2021-03-28 /pmc/articles/PMC8479665/ /pubmed/33774677 http://dx.doi.org/10.1093/bioinformatics/btab212 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Original Papers
Mbebi, Alain J
Tong, Hao
Nikoloski, Zoran
L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title_full L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title_fullStr L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title_full_unstemmed L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title_short L(2,1)-norm regularized multivariate regression model with applications to genomic prediction
title_sort l(2,1)-norm regularized multivariate regression model with applications to genomic prediction
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479665/
https://www.ncbi.nlm.nih.gov/pubmed/33774677
http://dx.doi.org/10.1093/bioinformatics/btab212
work_keys_str_mv AT mbebialainj l21normregularizedmultivariateregressionmodelwithapplicationstogenomicprediction
AT tonghao l21normregularizedmultivariateregressionmodelwithapplicationstogenomicprediction
AT nikoloskizoran l21normregularizedmultivariateregressionmodelwithapplicationstogenomicprediction