
Architectures and accuracy of artificial neural network for disease classification from omics data

BACKGROUND: Deep learning has achieved tremendous success in numerous artificial intelligence applications and is, unsurprisingly, penetrating various biomedical domains. High-throughput omics data in the form of molecular profile matrices, such as transcriptomes and metabolomes, have long existed...


Bibliographic Details
Main Authors:	Yu, Hui; Samuels, David C.; Zhao, Ying-yong; Guo, Yan
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6399893/ https://www.ncbi.nlm.nih.gov/pubmed/30832569 http://dx.doi.org/10.1186/s12864-019-5546-z

_version_	1783399834660634624
author	Yu, Hui Samuels, David C. Zhao, Ying-yong Guo, Yan
author_facet	Yu, Hui Samuels, David C. Zhao, Ying-yong Guo, Yan
author_sort	Yu, Hui
collection	PubMed
description	BACKGROUND: Deep learning has achieved tremendous success in numerous artificial intelligence applications and is, unsurprisingly, penetrating various biomedical domains. High-throughput omics data in the form of molecular profile matrices, such as transcriptomes and metabolomes, have long existed as a valuable resource for facilitating diagnosis of patient statuses and stages. It is timely and imperative to compare deep learning neural networks against classical machine learning methods on matrix-formed omics data in terms of classification accuracy and robustness. RESULTS: Using 37 high-throughput omics datasets, covering transcriptomes and metabolomes, we evaluated the classification power of deep learning compared to traditional machine learning methods. Two representative deep learning methods, Multi-Layer Perceptrons (MLP) and Convolutional Neural Networks (CNN), were deployed and explored to find the architectures that give the best classification performance. Together with five classical supervised classification methods (Linear Discriminant Analysis, Multinomial Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machine), MLP and CNN were comparatively tested on the 37 datasets to predict disease stages or to discriminate diseased samples from normal samples. MLPs achieved the highest overall accuracy among all methods tested. More thorough analyses revealed that single-hidden-layer MLPs with ample hidden units outperformed deeper MLPs. Furthermore, MLP was one of the most robust methods against imbalanced class composition and inaccurate class labels. CONCLUSION: Our results indicate that shallow MLPs (of one or two hidden layers) with ample hidden neurons are sufficient to achieve superior and robust classification performance when exploiting numerical matrix-formed omics data for diagnostic purposes. Specific observations regarding optimal network width, class imbalance tolerance, and inaccurate labeling tolerance will inform future improvement of neural network applications on functional genomics data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5546-z) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6399893
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6399893 2019-03-13 Architectures and accuracy of artificial neural network for disease classification from omics data Yu, Hui Samuels, David C. Zhao, Ying-yong Guo, Yan BMC Genomics Research Article BACKGROUND: Deep learning has achieved tremendous success in numerous artificial intelligence applications and is, unsurprisingly, penetrating various biomedical domains. High-throughput omics data in the form of molecular profile matrices, such as transcriptomes and metabolomes, have long existed as a valuable resource for facilitating diagnosis of patient statuses and stages. It is timely and imperative to compare deep learning neural networks against classical machine learning methods on matrix-formed omics data in terms of classification accuracy and robustness. RESULTS: Using 37 high-throughput omics datasets, covering transcriptomes and metabolomes, we evaluated the classification power of deep learning compared to traditional machine learning methods. Two representative deep learning methods, Multi-Layer Perceptrons (MLP) and Convolutional Neural Networks (CNN), were deployed and explored to find the architectures that give the best classification performance. Together with five classical supervised classification methods (Linear Discriminant Analysis, Multinomial Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machine), MLP and CNN were comparatively tested on the 37 datasets to predict disease stages or to discriminate diseased samples from normal samples. MLPs achieved the highest overall accuracy among all methods tested. More thorough analyses revealed that single-hidden-layer MLPs with ample hidden units outperformed deeper MLPs. Furthermore, MLP was one of the most robust methods against imbalanced class composition and inaccurate class labels. CONCLUSION: Our results indicate that shallow MLPs (of one or two hidden layers) with ample hidden neurons are sufficient to achieve superior and robust classification performance when exploiting numerical matrix-formed omics data for diagnostic purposes. Specific observations regarding optimal network width, class imbalance tolerance, and inaccurate labeling tolerance will inform future improvement of neural network applications on functional genomics data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5546-z) contains supplementary material, which is available to authorized users. BioMed Central 2019-03-04 /pmc/articles/PMC6399893/ /pubmed/30832569 http://dx.doi.org/10.1186/s12864-019-5546-z Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Article Yu, Hui Samuels, David C. Zhao, Ying-yong Guo, Yan Architectures and accuracy of artificial neural network for disease classification from omics data
title	Architectures and accuracy of artificial neural network for disease classification from omics data
title_sort	architectures and accuracy of artificial neural network for disease classification from omics data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6399893/ https://www.ncbi.nlm.nih.gov/pubmed/30832569 http://dx.doi.org/10.1186/s12864-019-5546-z
work_keys_str_mv	AT yuhui architecturesandaccuracyofartificialneuralnetworkfordiseaseclassificationfromomicsdata AT samuelsdavidc architecturesandaccuracyofartificialneuralnetworkfordiseaseclassificationfromomicsdata AT zhaoyingyong architecturesandaccuracyofartificialneuralnetworkfordiseaseclassificationfromomicsdata AT guoyan architecturesandaccuracyofartificialneuralnetworkfordiseaseclassificationfromomicsdata
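
The comparison summarized in the description field above (a single-hidden-layer MLP against five classical classifiers on a samples-by-features matrix) could be sketched roughly as follows. This is only an illustration, not the authors' pipeline: it assumes scikit-learn, a synthetic matrix from `make_classification` standing in for a real omics dataset, a hypothetical hidden-layer width of 128, and a hypothetical helper named `evaluate`. The CNN arm of the study is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for an omics matrix
# (rows = samples, columns = molecular features, y = diseased vs. normal).
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=20, n_classes=2, random_state=0
)

# The five classical methods named in the abstract, plus a shallow MLP.
# All hyperparameters are illustrative choices, not values from the paper.
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "MLP (1 hidden layer)": MLPClassifier(
        hidden_layer_sizes=(128,), max_iter=500, random_state=0
    ),
}


def evaluate(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each named model."""
    scores = {}
    for name, model in models.items():
        # Standardize features inside each fold to avoid leaking test data.
        pipeline = make_pipeline(StandardScaler(), model)
        fold_scores = cross_val_score(pipeline, X, y, cv=folds, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


results = evaluate(classifiers, X, y)
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name:22s} {accuracy:.3f}")
```

Accuracies on synthetic data say nothing about the 37 datasets in the study; the sketch only shows the shape of such a comparison. Changing `hidden_layer_sizes` to, for example, `(128, 128)` would give a deeper MLP for the width-versus-depth question the abstract raises.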

