Sparse Convolutional Neural Networks for Genome-Wide Prediction

Genome-wide prediction (GWP) has become the state-of-the-art method in artificial selection. Data sets often comprise numbers of genomic markers and individuals ranging from a few thousand to millions. Hence, computational efficiency is important and various machine learning methods have successfu...

Bibliographic Details
Main Authors:	Waldmann, Patrik; Pfeiffer, Christina; Mészáros, Gábor
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7029737/ https://www.ncbi.nlm.nih.gov/pubmed/32117441 http://dx.doi.org/10.3389/fgene.2020.00025

author	Waldmann, Patrik; Pfeiffer, Christina; Mészáros, Gábor
collection	PubMed
description	Genome-wide prediction (GWP) has become the state-of-the-art method in artificial selection. Data sets often comprise numbers of genomic markers and individuals ranging from a few thousand to millions. Hence, computational efficiency is important, and various machine learning methods have been used successfully in GWP. Neural networks (NN) and deep learning (DL) are very flexible methods that usually show outstanding prediction properties on complex structured data, but their use in GWP is nevertheless rare and debated. This study describes a powerful NN method for genomic marker data that can easily be extended. It is shown that a one-dimensional convolutional neural network (CNN) can be used to incorporate the ordinal information between markers and, together with pooling and ℓ1-norm regularization, provides a sparse and computationally efficient approach for GWP. The method, denoted CNNGWP, is implemented in the deep learning software Keras, and hyper-parameters of the NN are tuned with Bayesian optimization. Model-averaged ensemble predictions further reduce prediction error. Evaluations show that CNNGWP reduces prediction error by more than 25% on simulated data and around 3% on real pig data compared with results obtained with GBLUP and the LASSO. In conclusion, CNNGWP provides a promising approach for GWP, but the magnitude of improvement depends on the genetic architecture and the heritability.
format	Online Article Text
id	pubmed-7029737
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7029737 2020-02-28 Sparse Convolutional Neural Networks for Genome-Wide Prediction. Waldmann, Patrik; Pfeiffer, Christina; Mészáros, Gábor. Front Genet. Genetics. Frontiers Media S.A. 2020-02-06. /pmc/articles/PMC7029737/ /pubmed/32117441 http://dx.doi.org/10.3389/fgene.2020.00025 Text en. Copyright © 2020 Waldmann, Pfeiffer and Mészáros. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Sparse Convolutional Neural Networks for Genome-Wide Prediction
topic	Genetics