
Genome-enabled prediction using probabilistic neural network classifiers

BACKGROUND: Multi-layer perceptron (MLP) and radial basis function neural networks (RBFNN) have been shown to be effective in genome-enabled prediction. Here, we evaluated and compared the classification performance of an MLP classifier versus that of a probabilistic neural network (PNN), to predict...


Bibliographic Details
Main Authors:	González-Camacho, Juan Manuel, Crossa, José, Pérez-Rodríguez, Paulino, Ornella, Leonardo, Gianola, Daniel
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784384/ https://www.ncbi.nlm.nih.gov/pubmed/26956885 http://dx.doi.org/10.1186/s12864-016-2553-1

_version_	1782420257807269888
author	González-Camacho, Juan Manuel Crossa, José Pérez-Rodríguez, Paulino Ornella, Leonardo Gianola, Daniel
author_sort	González-Camacho, Juan Manuel
collection	PubMed
description	BACKGROUND: Multi-layer perceptron (MLP) and radial basis function neural networks (RBFNN) have been shown to be effective in genome-enabled prediction. Here, we evaluated and compared the classification performance of an MLP classifier versus that of a probabilistic neural network (PNN), to predict the probability of membership of one individual in a phenotypic class of interest, using genomic and phenotypic data as input variables. We used 16 maize and 17 wheat genomic and phenotypic datasets with different trait-environment combinations (sample sizes ranged from 290 to 300 individuals) with 1.4 k and 55 k SNP chips. Classifiers were tested using continuous traits that were categorized into three classes (upper, middle and lower) based on the empirical distribution of each trait, constructed on the basis of two percentiles (15–85 % and 30–70 %). We focused on the 15 and 30 % percentiles for the upper and lower classes for selecting the best individuals, as commonly done in genomic selection. Wheat datasets were also used with two classes. The criteria for assessing the predictive accuracy of the two classifiers were the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUCpr). Parameters of both classifiers were estimated by optimizing the AUC for a specific class of interest.
RESULTS: The AUC and AUCpr criteria provided enough evidence to conclude that PNN was more accurate than MLP for assigning maize and wheat lines to the correct upper, middle or lower class for the complex traits analyzed. Results for the wheat datasets with continuous traits split into two and three classes showed that the performance of PNN with three classes was higher than with two classes when classifying individuals into the upper and lower (15 or 30 %) categories.
CONCLUSIONS: The PNN classifier outperformed the MLP classifier in all 33 (maize and wheat) datasets when using AUC and AUCpr for selecting individuals of a specific class. Use of PNN with Gaussian radial basis functions seems promising in genomic selection for identifying the best individuals. Categorizing continuous traits into three classes generally provided better classification than when using two classes, because classification accuracy improved when classes were balanced.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2553-1) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4784384
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4784384 2016-03-10 Genome-enabled prediction using probabilistic neural network classifiers González-Camacho, Juan Manuel Crossa, José Pérez-Rodríguez, Paulino Ornella, Leonardo Gianola, Daniel BMC Genomics Research Article BioMed Central 2016-03-09 /pmc/articles/PMC4784384/ /pubmed/26956885 http://dx.doi.org/10.1186/s12864-016-2553-1 Text en © González-Camacho et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Genome-enabled prediction using probabilistic neural network classifiers
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784384/ https://www.ncbi.nlm.nih.gov/pubmed/26956885 http://dx.doi.org/10.1186/s12864-016-2553-1
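
The abstract in this record describes three steps: splitting a continuous trait into lower, middle and upper classes at empirical percentiles, scoring class membership with a PNN built on Gaussian radial basis functions, and judging the result by AUC for one class of interest. The sketch below illustrates those steps on simulated data and is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the 0/1/2 marker coding, the simulated trait, the fixed kernel width `sigma` (the paper tuned its parameters by optimizing AUC), and the use of class frequencies as implicit priors in the PNN.

```python
import math
import random


def percentile_classes(values, lower_pct=15, upper_pct=85):
    """Label each value 'lower', 'middle' or 'upper' from empirical percentiles."""
    ranked = sorted(values)
    n = len(ranked)
    lo_cut = ranked[int(n * lower_pct / 100)]
    hi_cut = ranked[int(n * upper_pct / 100)]
    labels = []
    for v in values:
        if v < lo_cut:
            labels.append("lower")
        elif v >= hi_cut:
            labels.append("upper")
        else:
            labels.append("middle")
    return labels


def pnn_predict_proba(train_x, train_y, x, sigma):
    """Class membership probabilities for one marker vector x.

    Each training individual contributes a Gaussian kernel centred on its
    marker vector; kernels are summed per class and normalised to sum to 1.
    """
    sums = {}
    for xi, yi in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-sq_dist / (2.0 * sigma ** 2))
    total = sum(sums.values())
    if total == 0.0:  # every kernel underflowed: fall back to class frequencies
        return {c: train_y.count(c) / len(train_y) for c in sums}
    return {c: s / total for c, s in sums.items()}


def auc(scores, is_positive):
    """Area under the ROC curve as the fraction of correctly ordered pairs."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Simulated data (hypothetical): 300 lines, 30 markers coded 0/1/2.
rng = random.Random(0)
n_ind, n_markers = 300, 30
effects = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
markers = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_ind)]
trait = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0.0, 1.0)
         for row in markers]
labels = percentile_classes(trait)  # 15 % lower, 70 % middle, 15 % upper

# Stratified hold-out: every fourth individual of each class goes to the test set.
test_idx = set()
for c in ("lower", "middle", "upper"):
    members = [i for i, lab in enumerate(labels) if lab == c]
    test_idx.update(members[::4])
train_idx = [i for i in range(n_ind) if i not in test_idx]
test_idx = sorted(test_idx)

train_x = [markers[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]

sigma = 3.0  # assumed kernel width; in practice chosen by cross-validation
upper_scores = []
for i in test_idx:
    probs = pnn_predict_proba(train_x, train_y, markers[i], sigma)
    upper_scores.append(probs.get("upper", 0.0))

auc_upper = auc(upper_scores, [labels[i] == "upper" for i in test_idx])
print(f"AUC for the upper class on held-out lines: {auc_upper:.3f}")
```

With thousands of markers the squared distances grow large and the kernels can underflow to zero, so a practical version would likely work with log-kernels or rescale the marker matrix before applying the Gaussian.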
