Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers
BACKGROUND: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN c...
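Main authors: | Passafaro, Tiago L.; Lopes, Fernando B.; Dórea, João R. R.; Craven, Mark; Breen, Vivian; Hawken, Rachel J.; Rosa, Guilherme J. M. |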
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654004/ https://www.ncbi.nlm.nih.gov/pubmed/33167865 http://dx.doi.org/10.1186/s12864-020-07181-x |
_version_ | 1783607991403020288 |
---|---|
author | Passafaro, Tiago L. Lopes, Fernando B. Dórea, João R. R. Craven, Mark Breen, Vivian Hawken, Rachel J. Rosa, Guilherme J. M. |
author_sort | Passafaro, Tiago L. |
collection | PubMed |
description | BACKGROUND: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5000 observations, may have precluded the full potential of DNN. Therefore, the objective of this study was to investigate the impact of the dataset sample size on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers by sub-sampling 63,526 observations of the training set. RESULTS: Predictive performance of DNN improved as sample size increased, reaching a plateau at about 0.32 of prediction correlation when 60% of the entire training set size was used (i.e., 39,510 observations). Interestingly, DNN showed superior prediction correlation using up to 3% of the training set, but poorer prediction correlation after that compared to Bayesian Ridge Regression (BRR) and Bayes Cπ. Regardless of the amount of data used to train the predictive machines, DNN displayed the lowest mean square error of prediction compared to all other approaches. The predictive bias was lower for DNN compared to Bayesian models, across all dataset sizes, with estimates close to one with larger sample sizes. CONCLUSIONS: DNN had worse prediction correlation compared to BRR and Bayes Cπ, but improved mean square error of prediction and bias relative to both Bayesian models for genome-enabled prediction of body weight in broilers. Such findings highlight advantages and disadvantages of the predictive approaches depending on the criterion used for comparison. 
Furthermore, the inclusion of more data per se is not a guarantee for the DNN to outperform the Bayesian regression methods commonly used for genome-enabled prediction. Nonetheless, further analysis is necessary to detect scenarios where DNN can clearly outperform Bayesian benchmark models. |
format | Online Article Text |
id | pubmed-7654004 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7654004 2020-11-10 Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers Passafaro, Tiago L. Lopes, Fernando B. Dórea, João R. R. Craven, Mark Breen, Vivian Hawken, Rachel J. Rosa, Guilherme J. M. BMC Genomics Research Article BACKGROUND: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5000 observations, may have precluded the full potential of DNN. Therefore, the objective of this study was to investigate the impact of the dataset sample size on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers by sub-sampling 63,526 observations of the training set. RESULTS: Predictive performance of DNN improved as sample size increased, reaching a plateau at about 0.32 of prediction correlation when 60% of the entire training set size was used (i.e., 39,510 observations). Interestingly, DNN showed superior prediction correlation using up to 3% of the training set, but poorer prediction correlation after that compared to Bayesian Ridge Regression (BRR) and Bayes Cπ. Regardless of the amount of data used to train the predictive machines, DNN displayed the lowest mean square error of prediction compared to all other approaches. The predictive bias was lower for DNN compared to Bayesian models, across all dataset sizes, with estimates close to one with larger sample sizes. 
CONCLUSIONS: DNN had worse prediction correlation compared to BRR and Bayes Cπ, but improved mean square error of prediction and bias relative to both Bayesian models for genome-enabled prediction of body weight in broilers. Such findings highlight advantages and disadvantages of the predictive approaches depending on the criterion used for comparison. Furthermore, the inclusion of more data per se is not a guarantee for the DNN to outperform the Bayesian regression methods commonly used for genome-enabled prediction. Nonetheless, further analysis is necessary to detect scenarios where DNN can clearly outperform Bayesian benchmark models. BioMed Central 2020-11-09 /pmc/articles/PMC7654004/ /pubmed/33167865 http://dx.doi.org/10.1186/s12864-020-07181-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers |
topic | Research Article |