
Deep Learning-Based Pan-Cancer Classification Model Reveals Tissue-of-Origin Specific Gene Expression Signatures

SIMPLE SUMMARY: Gene expression data from different cancer types offer the opportunity to identify cancer tissue-of-origin specific biomarkers and targets. In this study, we used pan-cancer gene expression data to train a deep learning neural network model to identify cancer tissue-of-origin specifi...


Bibliographic Details
Main Authors: Divate, Mayur, Tyagi, Aayush, Richard, Derek J., Prasad, Prathosh A., Gowda, Harsha, Nagaraj, Shivashankar H.
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8909043/
https://www.ncbi.nlm.nih.gov/pubmed/35267493
http://dx.doi.org/10.3390/cancers14051185
author Divate, Mayur
Tyagi, Aayush
Richard, Derek J.
Prasad, Prathosh A.
Gowda, Harsha
Nagaraj, Shivashankar H.
collection PubMed
description SIMPLE SUMMARY: Gene expression data from different cancer types offer the opportunity to identify cancer tissue-of-origin specific biomarkers and targets. In this study, we used pan-cancer gene expression data to train a deep learning neural network model to identify cancer tissue-of-origin specific gene expression signatures. We identified 976 genes that can reliably classify different cancer types with >97% accuracy. ABSTRACT: Cancer tissue-of-origin specific biomarkers are needed for effective diagnosis, monitoring, and treatment of cancers. In this study, we analyzed transcriptomics data from 37 cancer types provided by The Cancer Genome Atlas (TCGA) to identify cancer tissue-of-origin specific gene expression signatures. We developed a deep neural network model to classify cancers based on gene expression data. The model achieved a predictive accuracy of >97% across cancer types, indicating the presence of distinct cancer tissue-of-origin specific gene expression signatures. We interpreted the model using Shapley additive explanations to identify specific gene signatures that significantly contributed to cancer-type classification. We evaluated the model and the validity of gene signatures using an independent test data set from the International Cancer Genome Consortium. In conclusion, we present a robust neural network model for accurate classification of cancers based on gene expression data and also provide a list of gene signatures that are valuable for developing biomarker panels for determining cancer tissue-of-origin. These gene signatures serve as valuable biomarkers for determining tissue-of-origin for cancers of unknown primary.
format Online
Article
Text
id pubmed-8909043
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8909043 2022-03-11 Cancers (Basel) Article
MDPI 2022-02-24 /pmc/articles/PMC8909043/ /pubmed/35267493 http://dx.doi.org/10.3390/cancers14051185 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Deep Learning-Based Pan-Cancer Classification Model Reveals Tissue-of-Origin Specific Gene Expression Signatures
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8909043/
https://www.ncbi.nlm.nih.gov/pubmed/35267493
http://dx.doi.org/10.3390/cancers14051185