
Wide and deep learning for automatic cell type identification

Cell type classification is an important problem in cancer research, especially with the advent of single cell technologies. Correctly identifying cells within the tumor microenvironment can provide oncologists with a snapshot of how a patient’s immune system reacts to the tumor. Wide and deep learn...


Bibliographic Details
Main authors: Wilson, Christopher M., Fridley, Brooke L., Conejo-Garcia, José R., Wang, Xuefeng, Yu, Xiaoqing
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7878986/
https://www.ncbi.nlm.nih.gov/pubmed/33613870
http://dx.doi.org/10.1016/j.csbj.2021.01.027
_version_ 1783650438938099712
author Wilson, Christopher M.
Fridley, Brooke L.
Conejo-Garcia, José R.
Wang, Xuefeng
Yu, Xiaoqing
collection PubMed
description Cell type classification is an important problem in cancer research, especially with the advent of single-cell technologies. Correctly identifying cells within the tumor microenvironment can provide oncologists with a snapshot of how a patient’s immune system reacts to the tumor. Wide and deep learning (WDL) is an approach to constructing a cell-classification prediction model that can learn patterns within high-dimensional data (deep) and ensure that biologically relevant features (wide) remain in the final model. In this paper, we demonstrate that regularization can prevent overfitting and that adding a wide component to a neural network can result in a model with better predictive performance. In particular, we observed that a combination of dropout and [Formula: see text] regularization can lead to a validation loss function that does not depend on the number of training iterations and does not experience a significant decrease in prediction accuracy compared to models with [Formula: see text], dropout, or no regularization. Additionally, we show that WDL can have superior classification accuracy when a model is trained and tested on data that arise from the same cancer type but from different platforms. More specifically, compared with traditional deep learning models, WDL can substantially increase the overall cell type prediction accuracy (from 36.5% to 86.9%) and the prediction accuracy for T cell subtypes (CD4: 2.4% to 59.1%; CD8: 19.5% to 96.1%) when the models are trained on melanoma data obtained from the 10X platform and tested on basal cell carcinoma data obtained using SMART-seq. WDL also obtains higher accuracy than the state-of-the-art cell classification algorithms CHETAH (70.36%) and SingleR (70.59%).
format Online
Article
Text
id pubmed-7878986
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-7878986 2021-02-19 Wide and deep learning for automatic cell type identification Wilson, Christopher M. Fridley, Brooke L. Conejo-Garcia, José R. Wang, Xuefeng Yu, Xiaoqing Comput Struct Biotechnol J Research Article
Research Network of Computational and Structural Biotechnology 2021-01-26 /pmc/articles/PMC7878986/ /pubmed/33613870 http://dx.doi.org/10.1016/j.csbj.2021.01.027 Text en © 2021 The Authors http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Wide and deep learning for automatic cell type identification
topic Research Article