Interpretable artificial neural networks incorporating Bayesian alphabet models for genome-wide prediction and association studies

In conventional linear models for whole-genome prediction and genome-wide association studies (GWAS), it is usually assumed that the relationship between genotypes and phenotypes is linear. Bayesian neural networks have been used to account for non-linearity such as complex genetic architectures. He...

Bibliographic Details
Main authors:	Zhao, Tianjing; Fernando, Rohan; Cheng, Hao
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Investigation
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496266/ https://www.ncbi.nlm.nih.gov/pubmed/34499126 http://dx.doi.org/10.1093/g3journal/jkab228

collection	PubMed
description	In conventional linear models for whole-genome prediction and genome-wide association studies (GWAS), it is usually assumed that the relationship between genotypes and phenotypes is linear. Bayesian neural networks have been used to account for non-linearity such as complex genetic architectures. Here, we introduce a method named NN-Bayes, where “NN” stands for neural networks, and “Bayes” stands for Bayesian Alphabet models, a collection of Bayesian regression models that includes BayesA, BayesB, BayesC, and Bayesian LASSO. NN-Bayes incorporates Bayesian Alphabet models into non-linear neural networks via hidden layers between single-nucleotide polymorphisms (SNPs) and observed traits. Thus, NN-Bayes attempts to improve the performance of genome-wide prediction and GWAS by accommodating non-linear relationships between the hidden nodes and the observed trait, while maintaining genomic interpretability through the Bayesian regression models that connect the SNPs to the hidden nodes. For genomic interpretability, the posterior distribution of marker effects in NN-Bayes is inferred by Markov chain Monte Carlo approaches and used for inference of association through posterior inclusion probabilities and window posterior probability of association. In simulation studies with dominance and epistatic effects, the performance of NN-Bayes was significantly better than that of conventional linear models for both GWAS and whole-genome prediction, and the differences in prediction accuracy were substantial in magnitude. In real-data analyses, for the soy dataset, NN-Bayes achieved significantly higher prediction accuracies than conventional linear models, and results from four other species showed that NN-Bayes had similar prediction performance to linear models, which is potentially due to the small sample size. Our NN-Bayes is optimized for high-dimensional genomic data and implemented in an open-source package called “JWAS.” NN-Bayes can lead to greater use of Bayesian neural networks to account for non-linear relationships due to its interpretability and computational performance.
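The abstract says that association is inferred from MCMC samples of marker effects through posterior inclusion probabilities and the window posterior probability of association. The sketch below illustrates how such summaries could be computed from a matrix of sampled effects. It is not the JWAS implementation: the function names, the fixed-size non-overlapping windows, the toy data, and the WPPA definition used here (the share of samples in which a window's variance exceeds a chosen fraction of the sampled genetic variance) are all assumptions made for illustration.

```python
import numpy as np


def posterior_inclusion_probability(effect_samples):
    """Fraction of MCMC samples in which each marker has a nonzero effect.

    effect_samples: array of shape (n_samples, n_markers); hypothetical input.
    """
    effect_samples = np.asarray(effect_samples, dtype=float)
    return (effect_samples != 0.0).mean(axis=0)


def window_ppa(effect_samples, genotypes, window_size, threshold):
    """Share of MCMC samples in which a window explains more than
    `threshold` of the sampled genetic variance (assumed WPPA definition).

    genotypes: array of shape (n_individuals, n_markers).
    Windows are consecutive, non-overlapping blocks of `window_size` markers.
    """
    effect_samples = np.asarray(effect_samples, dtype=float)
    genotypes = np.asarray(genotypes, dtype=float)
    n_samples, n_markers = effect_samples.shape
    starts = range(0, n_markers, window_size)
    counts = np.zeros(len(starts))
    for s in range(n_samples):
        alpha = effect_samples[s]
        total_var = np.var(genotypes @ alpha)
        if total_var == 0.0:
            continue  # no genetic variance in this sample, so no window counts
        for w, start in enumerate(starts):
            cols = slice(start, start + window_size)
            window_var = np.var(genotypes[:, cols] @ alpha[cols])
            if window_var / total_var > threshold:
                counts[w] += 1.0
    return counts / n_samples


# Toy data (made up): 4 MCMC samples of effects for 4 markers, 5 individuals.
effects = np.array([
    [0.5, 0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],
    [0.6, 0.2, 0.0, 0.0],
])
genotypes = np.array([
    [0, 1, 2, 0],
    [1, 0, 1, 2],
    [2, 2, 0, 1],
    [0, 1, 1, 1],
    [1, 2, 2, 0],
], dtype=float)

pip = posterior_inclusion_probability(effects)
wppa = window_ppa(effects, genotypes, window_size=2, threshold=0.5)
```

In this toy setup a marker's inclusion probability is simply how often its sampled effect is nonzero, which is meaningful for variable-selection priors such as BayesB or BayesC; the window summary aggregates neighbouring markers so that signal spread across correlated SNPs is not diluted.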
id	pubmed-8496266
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8496266 2021-10-07 G3 (Bethesda) Oxford University Press 2021-07-10 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Interpretable artificial neural networks incorporating Bayesian alphabet models for genome-wide prediction and association studies

Similar items