Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models
A variety of statistical methods, such as admixture models, have been used to estimate genomic breed composition (GBC). These methods, however, tend to produce non-zero components to reference breeds that shared some genomic similarity with a test animal. These non-essential GBC components, in turn,...
Main Authors: | Wang, Yangfan; Wu, Xiao-Lin; Li, Zhi; Bao, Zhenmin; Tait, Richard G.; Bauck, Stewart; Rosa, Guilherme J. M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300184/ https://www.ncbi.nlm.nih.gov/pubmed/32595700 http://dx.doi.org/10.3389/fgene.2020.00576 |
_version_ | 1783547535034417152 |
---|---|
author | Wang, Yangfan Wu, Xiao-Lin Li, Zhi Bao, Zhenmin Tait, Richard G. Bauck, Stewart Rosa, Guilherme J. M. |
author_sort | Wang, Yangfan |
collection | PubMed |
description | A variety of statistical methods, such as admixture models, have been used to estimate genomic breed composition (GBC). These methods, however, tend to produce non-zero components to reference breeds that shared some genomic similarity with a test animal. These non-essential GBC components, in turn, offset the estimated GBC for the breed to which it belongs. As a result, not all purebred animals have 100% GBC of their respective breeds, which statistically indicates an elevated false-negative rate in the identification of purebred animals with 100% GBC as the cutoff. Otherwise, a lower cutoff of estimated GBC will have to be used, which is arbitrary, and the results are less interpretable. In the present study, three admixture models with regularization were proposed, which produced sparse solutions through suppressing the noise in the estimated GBC due to genomic similarities. The regularization or penalty forms included the L1 norm penalty, minimax concave penalty (MCP), and smooth clipped absolute deviation (SCAD). The performances of these regularized admixture models on the estimation of GBC were examined in purebred and composite animals, respectively, and compared to that of the non-regularized admixture model as the baseline model. The results showed that, given optimal values for λ, the three sparsely regularized admixture models had higher power and thus reduced the false-negative rate for the breed identification of purebred animals than the non-regularized admixture model. Of the three regularized admixture models, the two with a non-convex penalty outperformed the one with L1 norm penalty. In the Brangus, a composite cattle breed, estimated GBC were roughly comparable among the four admixture models, but all the four models underestimated the GBC for these composite animals when non-ancestral breeds were included as the reference. In conclusion, the admixture models with sparse regularization gave more parsimonious, consistent and interpretable results of estimated GBC for purebred animals than the non-regularized admixture model. Nevertheless, the utility of regularized admixture models for estimating GBC in crossbred or composite animals needs to be taken with caution. |
format | Online Article Text |
id | pubmed-7300184 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-73001842020-06-26 Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models Wang, Yangfan Wu, Xiao-Lin Li, Zhi Bao, Zhenmin Tait, Richard G. Bauck, Stewart Rosa, Guilherme J. M. Front Genet Genetics A variety of statistical methods, such as admixture models, have been used to estimate genomic breed composition (GBC). These methods, however, tend to produce non-zero components to reference breeds that shared some genomic similarity with a test animal. These non-essential GBC components, in turn, offset the estimated GBC for the breed to which it belongs. As a result, not all purebred animals have 100% GBC of their respective breeds, which statistically indicates an elevated false-negative rate in the identification of purebred animals with 100% GBC as the cutoff. Otherwise, a lower cutoff of estimated GBC will have to be used, which is arbitrary, and the results are less interpretable. In the present study, three admixture models with regularization were proposed, which produced sparse solutions through suppressing the noise in the estimated GBC due to genomic similarities. The regularization or penalty forms included the L1 norm penalty, minimax concave penalty (MCP), and smooth clipped absolute deviation (SCAD). The performances of these regularized admixture models on the estimation of GBC were examined in purebred and composite animals, respectively, and compared to that of the non-regularized admixture model as the baseline model. The results showed that, given optimal values for λ, the three sparsely regularized admixture models had higher power and thus reduced the false-negative rate for the breed identification of purebred animals than the non-regularized admixture model. Of the three regularized admixture models, the two with a non-convex penalty outperformed the one with L1 norm penalty. In the Brangus, a composite cattle breed, estimated GBC were roughly comparable among the four admixture models, but all the four models underestimated the GBC for these composite animals when non-ancestral breeds were included as the reference. In conclusion, the admixture models with sparse regularization gave more parsimonious, consistent and interpretable results of estimated GBC for purebred animals than the non-regularized admixture model. Nevertheless, the utility of regularized admixture models for estimating GBC in crossbred or composite animals needs to be taken with caution. Frontiers Media S.A. 2020-06-11 /pmc/articles/PMC7300184/ /pubmed/32595700 http://dx.doi.org/10.3389/fgene.2020.00576 Text en Copyright © 2020 Wang, Wu, Li, Bao, Tait, Bauck and Rosa. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300184/ https://www.ncbi.nlm.nih.gov/pubmed/32595700 http://dx.doi.org/10.3389/fgene.2020.00576 |
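
The description above names three penalty forms: the L1 norm, MCP, and SCAD. As a minimal sketch, the standard textbook definitions of these penalties for a single coefficient can be written as below. This is not the authors' implementation: the function names and the default shape parameters (`gamma=3.0` for MCP, `a=3.7` for SCAD) are illustrative assumptions, and the record does not say how the penalty enters the admixture likelihood.

```python
def l1_penalty(t, lam):
    """L1 (lasso) penalty: grows linearly with |t| without bound."""
    return lam * abs(t)


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty (standard form); gamma > 1 is an assumed default.

    Matches the L1 slope at zero, then flattens to the constant
    gamma * lam**2 / 2 once |t| exceeds gamma * lam.
    """
    x = abs(t)
    if x <= gamma * lam:
        return lam * x - x * x / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def scad_penalty(t, lam, a=3.7):
    """Smoothly clipped absolute deviation (standard form); a > 2 is an assumed default.

    Equals the L1 penalty for |t| <= lam, follows a quadratic blend up to
    a * lam, and is constant at (a + 1) * lam**2 / 2 beyond that.
    """
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return (2.0 * a * lam * x - x * x - lam * lam) / (2.0 * (a - 1.0))
    return 0.5 * (a + 1.0) * lam * lam


if __name__ == "__main__":
    # Compare the three penalties on a small and a large coefficient.
    for t in (0.05, 0.95):
        print(t, l1_penalty(t, 0.1), mcp_penalty(t, 0.1), scad_penalty(t, 0.1))
```

For small coefficients the three penalties behave alike, so all of them push minor components toward zero. For large coefficients the two non-convex penalties level off while the L1 penalty keeps growing, which is the usual argument for why MCP and SCAD shrink a dominant component less than L1 does.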