Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers

BACKGROUND: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN c...

Bibliographic Details
Main authors: Passafaro, Tiago L., Lopes, Fernando B., Dórea, João R. R., Craven, Mark, Breen, Vivian, Hawken, Rachel J., Rosa, Guilherme J. M.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654004/
https://www.ncbi.nlm.nih.gov/pubmed/33167865
http://dx.doi.org/10.1186/s12864-020-07181-x
author Passafaro, Tiago L.
Lopes, Fernando B.
Dórea, João R. R.
Craven, Mark
Breen, Vivian
Hawken, Rachel J.
Rosa, Guilherme J. M.
collection PubMed
description BACKGROUND: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5000 observations, may have kept DNN from reaching their full potential. Therefore, the objective of this study was to investigate the impact of the dataset sample size on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers by sub-sampling 63,526 observations of the training set.
RESULTS: Predictive performance of DNN improved as sample size increased, reaching a plateau at a prediction correlation of about 0.32 when 60% of the entire training set size was used (i.e., 39,510 observations). Interestingly, DNN showed superior prediction correlation using up to 3% of the training set, but poorer prediction correlation after that compared to Bayesian Ridge Regression (BRR) and Bayes Cπ. Regardless of the amount of data used to train the predictive machines, DNN displayed the lowest mean square error of prediction compared to all other approaches. The predictive bias was lower for DNN compared to Bayesian models across all dataset sizes, with estimates close to one with larger sample sizes.
CONCLUSIONS: DNN had worse prediction correlation compared to BRR and Bayes Cπ, but improved mean square error of prediction and bias relative to both Bayesian models for genome-enabled prediction of body weight in broilers. Such findings highlight the advantages and disadvantages of each predictive approach depending on the criterion used for comparison. Furthermore, the inclusion of more data per se is not a guarantee for the DNN to outperform the Bayesian regression methods commonly used for genome-enabled prediction. Nonetheless, further analysis is necessary to detect scenarios where DNN can clearly outperform Bayesian benchmark models.
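The abstract compares models on three criteria (prediction correlation, mean square error of prediction, and predictive bias) after training on sub-samples of the training set. The sketch below is only an illustration of how such criteria are commonly computed, not the authors' code. It assumes that predictive bias is measured as the slope of the regression of observed on predicted values, which would be consistent with "estimates close to one" but is not stated in the record. The function names `prediction_metrics` and `subsample` are hypothetical.

```python
import random
from math import sqrt


def prediction_metrics(observed, predicted):
    """Return correlation, mean square error, and regression slope.

    Assumption: 'slope' (observed regressed on predicted) stands in for
    the predictive bias mentioned in the abstract; 1.0 means no
    inflation or deflation of the predictions.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    return {
        "correlation": cov / sqrt(var_o * var_p),
        "mse": sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n,
        "slope": cov / var_p,
    }


def subsample(indices, fraction, seed=0):
    """Draw a random subset of training indices without replacement."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(indices)))
    return rng.sample(indices, k)


# Toy values (not data from the study): predictions that are perfectly
# correlated with the observations but inflated by a factor of two.
toy = prediction_metrics([1, 2, 3, 4], [2, 4, 6, 8])
```

In this toy case the correlation is perfect while the slope and mean square error are poor, which shows why a model can rank differently depending on the criterion used for comparison.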
format Online
Article
Text
id pubmed-7654004
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7654004 2020-11-10 Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers Passafaro, Tiago L. Lopes, Fernando B. Dórea, João R. R. Craven, Mark Breen, Vivian Hawken, Rachel J. Rosa, Guilherme J. M. BMC Genomics Research Article BioMed Central 2020-11-09 /pmc/articles/PMC7654004/ /pubmed/33167865 http://dx.doi.org/10.1186/s12864-020-07181-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers
topic Research Article