
Fully-Connected Neural Networks with Reduced Parameterization for Predicting Histological Types of Lung Cancer from Somatic Mutations

Several challenges appear in the application of deep learning to genomic data. First, the dimensionality of input can be orders of magnitude greater than the number of samples, forcing the model to be prone to overfitting the training dataset. Second, each input variable’s contribution to the predic...

Bibliographic Details
Main Authors: Kobayashi, Kazuma; Bolatkan, Amina; Shiina, Shuichiro; Hamamoto, Ryuji
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7563438/
https://www.ncbi.nlm.nih.gov/pubmed/32872133
http://dx.doi.org/10.3390/biom10091249
author Kobayashi, Kazuma
Bolatkan, Amina
Shiina, Shuichiro
Hamamoto, Ryuji
collection PubMed
description Several challenges appear in the application of deep learning to genomic data. First, the dimensionality of input can be orders of magnitude greater than the number of samples, forcing the model to be prone to overfitting the training dataset. Second, each input variable’s contribution to the prediction is usually difficult to interpret, owing to multiple nonlinear operations. Third, genetic data features sometimes have no innate structure. To alleviate these problems, we propose a modification to Diet Networks by adding element-wise input scaling. The original Diet Networks concept can considerably reduce the number of parameters of the fully-connected layers by taking the transposed data matrix as an input to its auxiliary network. The efficacy of the proposed architecture was evaluated on a binary classification task for lung cancer histology, that is, adenocarcinoma or squamous cell carcinoma, from a somatic mutation profile. The dataset consisted of 950 cases, and 5-fold cross-validation was performed for evaluating the model performance. The model achieved a prediction accuracy of around 80% and showed that our modification markedly stabilized the learning process. Also, latent representations acquired inside the model allowed us to interpret the relationship between somatic mutation sites for the prediction.
format Online
Article
Text
id pubmed-7563438
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7563438 2020-10-27 Biomolecules Article MDPI 2020-08-28 /pmc/articles/PMC7563438/ /pubmed/32872133 http://dx.doi.org/10.3390/biom10091249 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
topic Article