
Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models

Bibliographic Details
Main authors: Wang, Yangfan, Wu, Xiao-Lin, Li, Zhi, Bao, Zhenmin, Tait, Richard G., Bauck, Stewart, Rosa, Guilherme J. M.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300184/
https://www.ncbi.nlm.nih.gov/pubmed/32595700
http://dx.doi.org/10.3389/fgene.2020.00576
author Wang, Yangfan
Wu, Xiao-Lin
Li, Zhi
Bao, Zhenmin
Tait, Richard G.
Bauck, Stewart
Rosa, Guilherme J. M.
collection PubMed
description A variety of statistical methods, such as admixture models, have been used to estimate genomic breed composition (GBC). These methods, however, tend to assign non-zero components to reference breeds that share some genomic similarity with a test animal. These non-essential GBC components, in turn, offset the estimated GBC for the breed to which the animal belongs. As a result, not all purebred animals have 100% GBC of their respective breeds, which implies an elevated false-negative rate in the identification of purebred animals when 100% GBC is used as the cutoff. Otherwise, a lower cutoff of estimated GBC has to be used, which is arbitrary and makes the results less interpretable. In the present study, three admixture models with regularization were proposed, which produced sparse solutions by suppressing the noise in the estimated GBC due to genomic similarities. The regularization or penalty forms included the L1 norm penalty, the minimax concave penalty (MCP), and the smoothly clipped absolute deviation (SCAD) penalty. The performance of these regularized admixture models in estimating GBC was examined in purebred and composite animals and compared with that of the non-regularized admixture model as the baseline. The results showed that, given optimal values for the tuning parameter λ, the three sparsely regularized admixture models had higher power, and thus a lower false-negative rate, for the breed identification of purebred animals than the non-regularized admixture model. Of the three regularized admixture models, the two with a non-convex penalty outperformed the one with the L1 norm penalty. In Brangus, a composite cattle breed, estimated GBC were roughly comparable among the four admixture models, but all four models underestimated the GBC for these composite animals when non-ancestral breeds were included in the reference. In conclusion, the admixture models with sparse regularization gave more parsimonious, consistent, and interpretable estimates of GBC for purebred animals than the non-regularized admixture model. Nevertheless, the use of regularized admixture models for estimating GBC in crossbred or composite animals should be approached with caution.
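The description above names three penalty forms without defining them. As a rough illustration only, the sketch below evaluates the textbook definitions of the L1, MCP, and SCAD penalties on a vector of GBC estimates. It is not the authors' implementation: the function names, the example GBC vector, and the shape parameters `gamma=3.0` and `a=3.7` are assumptions chosen for illustration, and the record does not say how the penalties enter the admixture likelihood.

```python
import numpy as np

def l1_penalty(q, lam):
    """L1 norm penalty: grows linearly with |q| without bound."""
    q = np.abs(np.asarray(q, dtype=float))
    return lam * q

def mcp_penalty(q, lam, gamma=3.0):
    """Minimax concave penalty (textbook form); gamma > 1 is an assumed value."""
    q = np.abs(np.asarray(q, dtype=float))
    rising = lam * q - q**2 / (2.0 * gamma)   # concave part for |q| <= gamma*lam
    flat = 0.5 * gamma * lam**2               # constant beyond gamma*lam
    return np.where(q <= gamma * lam, rising, flat)

def scad_penalty(q, lam, a=3.7):
    """SCAD penalty (textbook form); a > 2 is an assumed value."""
    q = np.abs(np.asarray(q, dtype=float))
    linear = lam * q                                              # |q| <= lam
    quad = (2.0 * a * lam * q - q**2 - lam**2) / (2.0 * (a - 1))  # lam < |q| <= a*lam
    flat = 0.5 * (a + 1.0) * lam**2                               # |q| > a*lam
    return np.where(q <= lam, linear, np.where(q <= a * lam, quad, flat))

if __name__ == "__main__":
    # Hypothetical GBC estimates for one animal against three reference breeds.
    gbc = np.array([0.93, 0.04, 0.03])
    lam = 0.1
    for name, fn in [("L1", l1_penalty), ("MCP", mcp_penalty), ("SCAD", scad_penalty)]:
        print(name, fn(gbc, lam))
```

Under these definitions the two non-convex penalties level off for large components while still penalizing small ones, whereas the L1 penalty keeps increasing with the size of the component.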
format Online
Article
Text
id pubmed-7300184
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7300184 2020-06-26 Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models Wang, Yangfan Wu, Xiao-Lin Li, Zhi Bao, Zhenmin Tait, Richard G. Bauck, Stewart Rosa, Guilherme J. M. Front Genet Genetics Frontiers Media S.A. 2020-06-11 /pmc/articles/PMC7300184/ /pubmed/32595700 http://dx.doi.org/10.3389/fgene.2020.00576 Text en Copyright © 2020 Wang, Wu, Li, Bao, Tait, Bauck and Rosa. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300184/
https://www.ncbi.nlm.nih.gov/pubmed/32595700
http://dx.doi.org/10.3389/fgene.2020.00576