Approximate Bayesian neural networks in genomic prediction
BACKGROUND: Genome-wide marker data are used both in phenotypic genome-wide association studies (GWAS) and genome-wide prediction (GWP). Typically, such studies include high-dimensional data with thousands to millions of single nucleotide polymorphisms (SNPs) recorded in hundreds to a few thousands...
Main author: | Waldmann, Patrik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6303864/ https://www.ncbi.nlm.nih.gov/pubmed/30577737 http://dx.doi.org/10.1186/s12711-018-0439-1 |
_version_ | 1783382244022288384 |
---|---|
author | Waldmann, Patrik |
author_facet | Waldmann, Patrik |
author_sort | Waldmann, Patrik |
collection | PubMed |
description | BACKGROUND: Genome-wide marker data are used both in phenotypic genome-wide association studies (GWAS) and genome-wide prediction (GWP). Typically, such studies include high-dimensional data with thousands to millions of single nucleotide polymorphisms (SNPs) recorded in hundreds to a few thousand individuals. Different machine-learning approaches have been used effectively in GWAS and GWP, but the use of neural networks (NN) and deep learning is still scarce. This study presents an NN model for genomic SNP data. RESULTS: We show, using both simulated and real pig data, that regularization is obtained using weight decay and dropout, and that this results in an approximate Bayesian (ABNN) model that can be used to obtain model-averaged posterior predictions. The ABNN model is implemented in mxnet and shown to yield better prediction accuracy than genomic best linear unbiased prediction and Bayesian LASSO. The mean squared error was reduced by at least 6.5% in the simulated data and by at least 1% in the real data. Moreover, by comparing NNs of different complexities, our results confirm that a shallow model with one layer, one neuron, one-hot encoding and a linear activation function performs better than more complex models. CONCLUSIONS: The ABNN model provides a computationally efficient approach with good prediction performance, in which the weight components can also provide information on the importance of the SNPs. Hence, ABNN is suitable for both GWP and GWAS. |
format | Online Article Text |
id | pubmed-6303864 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6303864 2018-12-31 Approximate Bayesian neural networks in genomic prediction Waldmann, Patrik Genet Sel Evol Research Article BioMed Central 2018-12-22 /pmc/articles/PMC6303864/ /pubmed/30577737 http://dx.doi.org/10.1186/s12711-018-0439-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Waldmann, Patrik Approximate Bayesian neural networks in genomic prediction |
title | Approximate Bayesian neural networks in genomic prediction |
title_full | Approximate Bayesian neural networks in genomic prediction |
title_fullStr | Approximate Bayesian neural networks in genomic prediction |
title_full_unstemmed | Approximate Bayesian neural networks in genomic prediction |
title_short | Approximate Bayesian neural networks in genomic prediction |
title_sort | approximate bayesian neural networks in genomic prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6303864/ https://www.ncbi.nlm.nih.gov/pubmed/30577737 http://dx.doi.org/10.1186/s12711-018-0439-1 |
work_keys_str_mv | AT waldmannpatrik approximatebayesianneuralnetworksingenomicprediction |
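
The abstract describes a shallow network (one layer, one neuron, one-hot encoded SNPs, linear activation) regularized with weight decay and dropout, and used to obtain model-averaged predictions. The sketch below illustrates that general idea in plain NumPy. It is not the author's mxnet implementation: the function names, the hyperparameter values, the choice to apply dropout to the inputs, and the use of repeated dropout passes at prediction time as the averaging step are all assumptions made for illustration.

```python
import numpy as np


def one_hot_genotypes(genotypes):
    """Encode an (n, p) integer matrix of 0/1/2 genotype codes as (n, 3p) indicators."""
    n, p = genotypes.shape
    encoded = np.zeros((n, 3 * p))
    rows = np.repeat(np.arange(n), p)
    # Column index for SNP j with genotype g is 3*j + g.
    cols = (np.arange(p) * 3 + genotypes).ravel()
    encoded[rows, cols] = 1.0
    return encoded


def train_linear_neuron(x, y, rng, epochs=300, lr=0.05,
                        weight_decay=1e-3, drop_rate=0.2):
    """Fit one linear neuron by full-batch gradient descent.

    Regularization (hypothetical settings): an L2 weight-decay term on the
    weights plus inverted dropout applied to the inputs at every epoch.
    """
    n, d = x.shape
    w = np.zeros(d)
    b = 0.0
    keep = 1.0 - drop_rate
    for _ in range(epochs):
        mask = rng.binomial(1, keep, size=x.shape) / keep
        x_drop = x * mask
        err = x_drop @ w + b - y
        w -= lr * (x_drop.T @ err / n + weight_decay * w)
        b -= lr * err.mean()
    return w, b


def mc_dropout_predict(x, w, b, rng, passes=100, drop_rate=0.2):
    """Average predictions over repeated stochastic passes with dropout left on."""
    keep = 1.0 - drop_rate
    draws = np.empty((passes, x.shape[0]))
    for t in range(passes):
        mask = rng.binomial(1, keep, size=x.shape) / keep
        draws[t] = (x * mask) @ w + b
    return draws.mean(axis=0), draws.std(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated toy data (not from the paper): 200 individuals, 20 SNPs,
    # with only the first SNP affecting the phenotype.
    genotypes = rng.integers(0, 3, size=(200, 20))
    phenotype = genotypes[:, 0] + rng.normal(0.0, 0.5, size=200)
    x = one_hot_genotypes(genotypes)
    w, b = train_linear_neuron(x, phenotype, rng)
    mean, sd = mc_dropout_predict(x, w, b, rng)
    print("in-sample correlation:", np.corrcoef(mean, phenotype)[0, 1])
    print("largest absolute weights at columns:", np.argsort(-np.abs(w))[:3])
```

With a single linear neuron, the fitted weights map directly onto genotype classes of individual SNPs, which is one way to read the abstract's remark that the weight components carry information on SNP importance. The spread across dropout passes gives a rough indication of predictive uncertainty under these assumptions.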