
Approximate Bayesian neural networks in genomic prediction

BACKGROUND: Genome-wide marker data are used both in phenotypic genome-wide association studies (GWAS) and genome-wide prediction (GWP). Typically, such studies include high-dimensional data with thousands to millions of single nucleotide polymorphisms (SNPs) recorded in hundreds to a few thousands...


Bibliographic Details
Main author: Waldmann, Patrik
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6303864/
https://www.ncbi.nlm.nih.gov/pubmed/30577737
http://dx.doi.org/10.1186/s12711-018-0439-1
author Waldmann, Patrik
collection PubMed
description BACKGROUND: Genome-wide marker data are used in both phenotypic genome-wide association studies (GWAS) and genome-wide prediction (GWP). Typically, such studies include high-dimensional data with thousands to millions of single nucleotide polymorphisms (SNPs) recorded in hundreds to a few thousand individuals. Different machine-learning approaches have been used effectively in GWAS and GWP, but the use of neural networks (NN) and deep learning is still scarce. This study presents an NN model for genomic SNP data. RESULTS: We show, using both simulated and real pig data, that regularization obtained using weight decay and dropout results in an approximate Bayesian (ABNN) model that can be used to obtain model-averaged posterior predictions. The ABNN model is implemented in mxnet and shown to yield better prediction accuracy than genomic best linear unbiased prediction and Bayesian LASSO. The mean squared error was reduced by at least 6.5% in the simulated data and by at least 1% in the real data. Moreover, by comparing NNs of different complexities, our results confirm that a shallow model with one layer, one neuron, one-hot encoding and a linear activation function performs better than more complex models. CONCLUSIONS: The ABNN model provides a computationally efficient approach with good prediction performance, in which the weight components can also provide information on the importance of the SNPs. Hence, ABNN is suitable for both GWP and GWAS.
format Online
Article
Text
id pubmed-6303864
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6303864 2018-12-31 Approximate Bayesian neural networks in genomic prediction Waldmann, Patrik Genet Sel Evol Research Article
BioMed Central 2018-12-22 /pmc/articles/PMC6303864/ /pubmed/30577737 http://dx.doi.org/10.1186/s12711-018-0439-1 Text en © The Author(s) 2018 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Approximate Bayesian neural networks in genomic prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6303864/
https://www.ncbi.nlm.nih.gov/pubmed/30577737
http://dx.doi.org/10.1186/s12711-018-0439-1
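
The shallow model named in the abstract (one layer, one neuron, one-hot encoded SNPs, a linear activation, regularization by weight decay and dropout, and model-averaged predictions) can be illustrated roughly as follows. This is a minimal pure-Python sketch and not the author's mxnet implementation. Everything specific here is an assumption made for illustration: the function names, the 0/1/2 genotype coding, applying dropout to the inputs, plain stochastic gradient descent on squared error, the hyperparameter values, and the toy data. Averaging many forward passes with dropout left on is one common way to approximate a posterior predictive mean; the abstract does not spell out the exact procedure used in the paper.

```python
import random


def one_hot_genotypes(genotypes):
    """Encode each SNP genotype (assumed coded 0, 1 or 2) as three indicators."""
    row = []
    for g in genotypes:
        row.extend(1.0 if g == k else 0.0 for k in (0, 1, 2))
    return row


def dropout(x, drop_rate, rng):
    """Inverted dropout: zero each input with probability drop_rate, rescale the rest."""
    keep = 1.0 - drop_rate
    return [xi / keep if rng.random() >= drop_rate else 0.0 for xi in x]


def neuron(weights, bias, x):
    """A single neuron with a linear (identity) activation."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def train(X, y, epochs=200, lr=0.05, weight_decay=0.01, drop_rate=0.2, seed=0):
    """SGD on squared error with an L2 penalty (weight decay) and input dropout."""
    rng = random.Random(seed)
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            masked = dropout(x, drop_rate, rng)
            err = neuron(weights, bias, masked) - target
            # Gradient of the squared error plus the weight-decay shrinkage term.
            weights = [w - lr * (err * m + weight_decay * w)
                       for w, m in zip(weights, masked)]
            bias -= lr * err
    return weights, bias


def mc_dropout_predict(weights, bias, x, n_samples=200, drop_rate=0.2, seed=1):
    """Average predictions over random dropout masks; return (mean, variance)."""
    rng = random.Random(seed)
    draws = [neuron(weights, bias, dropout(x, drop_rate, rng))
             for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    variance = sum((d - mean) ** 2 for d in draws) / n_samples
    return mean, variance


if __name__ == "__main__":
    # Hypothetical toy data: four individuals, two SNPs, made-up phenotypes.
    genotypes = [[0, 0], [1, 0], [2, 1], [0, 2]]
    phenotypes = [0.0, 1.0, 2.5, 1.0]
    X = [one_hot_genotypes(g) for g in genotypes]
    w, b = train(X, phenotypes)
    print(mc_dropout_predict(w, b, X[2]))
```

In a model this small, each learned weight corresponds to one genotype class of one SNP, which is the sense in which the abstract says the weight components can indicate SNP importance.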