Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data
Main Authors: | Koutsoukas, Alexios; Monaghan, Keith J.; Li, Xiaoli; Huan, Jun |
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2017 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5489441/ https://www.ncbi.nlm.nih.gov/pubmed/29086090 http://dx.doi.org/10.1186/s13321-017-0226-y |
_version_ | 1783246788727144448 |
author | Koutsoukas, Alexios; Monaghan, Keith J.; Li, Xiaoli; Huan, Jun |
author_facet | Koutsoukas, Alexios; Monaghan, Keith J.; Li, Xiaoli; Huan, Jun |
author_sort | Koutsoukas, Alexios |
collection | PubMed |
description | BACKGROUND: In recent years, research in artificial neural networks has resurged under the deep-learning umbrella and grown extremely popular. Recently reported successes of deep-learning (DL) techniques in crowd-sourced QSAR and predictive-toxicology competitions have showcased these methods as powerful tools in drug-discovery and toxicology research. The aim of this work was twofold: first, a large number of hyper-parameter configurations were explored to investigate how they affect the performance of deep neural networks (DNNs) and to provide starting points for DNN tuning; second, DNN performance was compared to that of popular methods widely employed in cheminformatics, namely Naïve Bayes, k-nearest neighbor, random forest, and support vector machines. Moreover, the robustness of the machine-learning methods to different levels of artificially introduced noise was assessed. The open-source Caffe deep-learning framework and modern NVidia GPUs were used to carry out this study, allowing a large number of DNN configurations to be explored. RESULTS: We show that, when optimized, feed-forward deep neural networks achieve strong classification performance and outperform shallow methods across diverse activity classes. The hyper-parameters found to play a critical role are the activation function, dropout regularization, the number of hidden layers, and the number of neurons. Compared to the other methods, tuned DNNs were found to perform statistically better, with p value <0.01 based on the Wilcoxon test. On average, DNNs achieved MCC scores 0.149 units higher than NB, 0.092 higher than kNN, 0.052 higher than SVM with a linear kernel, 0.021 higher than RF, and 0.009 higher than SVM with a radial basis function kernel. When exploring robustness to noise, the non-linear methods performed well at low noise levels (at or below 20%); at higher noise levels (above 30%), however, Naïve Bayes performed well and, at the highest noise level (50%), even outperformed the more sophisticated methods across several datasets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-017-0226-y) contains supplementary material, which is available to authorized users. |
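The evaluation protocol the abstract describes (per-dataset MCC scores compared pairwise with a Wilcoxon test, plus artificially flipped training labels to probe noise robustness) can be illustrated with a short sketch. This is not the authors' code: the study trained DNNs in Caffe on bioactivity data, whereas the stand-in below uses scikit-learn's MLPClassifier (which does not expose dropout), synthetic data from make_classification, a hypothetical flip_labels helper, and an arbitrary 20% noise level, all of which are assumptions for illustration only.

```python
# Minimal sketch of the comparison protocol described in the abstract.
# NOT the paper's code: models, data, and noise level are placeholder
# assumptions standing in for Caffe-trained DNNs on bioactivity datasets.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(0)

def flip_labels(y, fraction, rng):
    """Hypothetical helper: corrupt a fraction of binary training labels."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    y[idx] = 1 - y[idx]
    return y

mcc_nn, mcc_nb = [], []
for seed in range(10):  # stand-in for the paper's multiple activity classes
    X, y = make_classification(n_samples=1000, n_features=100,
                               random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    y_tr_noisy = flip_labels(y_tr, 0.20, rng)  # e.g. 20% label noise

    nn = MLPClassifier(hidden_layer_sizes=(128, 128), activation="relu",
                       max_iter=300, random_state=seed).fit(X_tr, y_tr_noisy)
    nb = BernoulliNB().fit(X_tr, y_tr_noisy)

    # Matthews correlation coefficient on the held-out test split.
    mcc_nn.append(matthews_corrcoef(y_te, nn.predict(X_te)))
    mcc_nb.append(matthews_corrcoef(y_te, nb.predict(X_te)))

# Paired, non-parametric comparison of per-dataset MCC scores.
stat, p = wilcoxon(mcc_nn, mcc_nb)
print(f"mean MCC gap: {np.mean(mcc_nn) - np.mean(mcc_nb):+.3f}, p = {p:.4f}")
```

With real data one would substitute the per-activity-class train/test splits and tuned models; the Wilcoxon test is then applied to the paired per-dataset MCC scores, matching the p < 0.01 significance criterion reported above.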
format | Online Article Text |
id | pubmed-5489441 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-5489441 2017-07-13 Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data Koutsoukas, Alexios; Monaghan, Keith J.; Li, Xiaoli; Huan, Jun J Cheminform Research Article Springer International Publishing 2017-06-28 /pmc/articles/PMC5489441/ /pubmed/29086090 http://dx.doi.org/10.1186/s13321-017-0226-y Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Koutsoukas, Alexios Monaghan, Keith J. Li, Xiaoli Huan, Jun Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title | Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title_full | Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title_fullStr | Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title_full_unstemmed | Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title_short | Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
title_sort | deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5489441/ https://www.ncbi.nlm.nih.gov/pubmed/29086090 http://dx.doi.org/10.1186/s13321-017-0226-y |
work_keys_str_mv | AT koutsoukasalexios deeplearninginvestigatingdeepneuralnetworkshyperparametersandcomparisonofperformancetoshallowmethodsformodelingbioactivitydata AT monaghankeithj deeplearninginvestigatingdeepneuralnetworkshyperparametersandcomparisonofperformancetoshallowmethodsformodelingbioactivitydata AT lixiaoli deeplearninginvestigatingdeepneuralnetworkshyperparametersandcomparisonofperformancetoshallowmethodsformodelingbioactivitydata AT huanjun deeplearninginvestigatingdeepneuralnetworkshyperparametersandcomparisonofperformancetoshallowmethodsformodelingbioactivitydata |