
Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle

Bibliographic Details
Main authors: Baba, Toshimi; Pegolo, Sara; Mota, Lucio F. M.; Peñagaricano, Francisco; Bittante, Giovanni; Cecchinato, Alessio; Morota, Gota
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7968271/
https://www.ncbi.nlm.nih.gov/pubmed/33726672
http://dx.doi.org/10.1186/s12711-021-00620-7
author Baba, Toshimi
Pegolo, Sara
Mota, Lucio F. M.
Peñagaricano, Francisco
Bittante, Giovanni
Cecchinato, Alessio
Morota, Gota
collection PubMed
description BACKGROUND: Over the past decade, Fourier transform infrared (FTIR) spectroscopy has been used to predict novel milk protein phenotypes. Genomic data might help predict these phenotypes when integrated with milk FTIR spectra. The objective of this study was to investigate prediction accuracy for milk protein phenotypes when heterogeneous on-farm, genomic, and pedigree data were integrated with the spectra. To this end, we used the records of 966 Italian Brown Swiss cows with milk FTIR spectra, on-farm information, medium-density genetic markers, and pedigree data. True and total whey protein, and five casein, and two whey protein traits were analyzed. Multiple kernel learning constructed from spectral and genomic (pedigree) relationship matrices and multilayer BayesB assigning separate priors for FTIR and markers were benchmarked against a baseline partial least squares (PLS) regression. Seven combinations of covariates were considered, and their predictive abilities were evaluated by repeated random sub-sampling and herd cross-validations (CV). RESULTS: Addition of the on-farm effects such as herd, days in milk, and parity to spectral data improved predictions as compared to those obtained using the spectra alone. Integrating genomics and/or the top three markers with a large effect further enhanced the predictions. Pedigree data also improved prediction, but to a lesser extent than genomic data. Multiple kernel learning and multilayer BayesB increased predictive performance, whereas PLS did not. Overall, multilayer BayesB provided better predictions than multiple kernel learning, and lower prediction performance was observed in herd CV compared to repeated random sub-sampling CV. CONCLUSIONS: Integration of genomic information with milk FTIR spectra can enhance milk protein trait predictions by 25% and 7% on average for repeated random sub-sampling and herd CV, respectively. Multiple kernel learning and multilayer BayesB outperformed PLS when used to integrate heterogeneous data for phenotypic predictions.
format Online
Article
Text
id pubmed-7968271
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7968271 2021-03-22 Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle Baba, Toshimi Pegolo, Sara Mota, Lucio F. M. Peñagaricano, Francisco Bittante, Giovanni Cecchinato, Alessio Morota, Gota Genet Sel Evol Research Article BioMed Central 2021-03-16 /pmc/articles/PMC7968271/ /pubmed/33726672 http://dx.doi.org/10.1186/s12711-021-00620-7 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle
topic Research Article