
Identification of gene profiles related to the development of oral cancer using a deep learning technique

BACKGROUND: Oral cancer (OC) is a debilitating disease that can adversely affect patients' quality of life. Patients with oral premalignant lesions have a high risk of developing OC. Therefore, identifying robust survival subgroups among them may significantly improve patient therapy and care....


Bibliographic Details
Main authors: Tapak, Leili, Ghasemi, Mohammad Kazem, Afshar, Saeid, Mahjub, Hossein, Soltanian, Alireza, Khotanlou, Hassan
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972685/
https://www.ncbi.nlm.nih.gov/pubmed/36849997
http://dx.doi.org/10.1186/s12920-023-01462-6
_version_ 1784898370567929856
author Tapak, Leili
Ghasemi, Mohammad Kazem
Afshar, Saeid
Mahjub, Hossein
Soltanian, Alireza
Khotanlou, Hassan
author_facet Tapak, Leili
Ghasemi, Mohammad Kazem
Afshar, Saeid
Mahjub, Hossein
Soltanian, Alireza
Khotanlou, Hassan
author_sort Tapak, Leili
collection PubMed
description BACKGROUND: Oral cancer (OC) is a debilitating disease that can adversely affect patients' quality of life. Patients with oral premalignant lesions have a high risk of developing OC. Therefore, identifying robust survival subgroups among them may significantly improve patient therapy and care. This study aimed to identify prognostic biomarkers that predict the time to development of OC and to stratify patients by survival risk using state-of-the-art machine learning and deep learning. METHODS: Gene expression profiles (29,096 probes) of 86 patients from the GSE26549 dataset in the GEO repository were used. An autoencoder deep neural network was used to extract 100 encoded features (the number of units in the encoding layer, i.e., the bottleneck of the network). A univariate Cox proportional hazards regression model was then used to select the significant encoded features (P < 0.05). High-risk and low-risk groups were identified by hierarchical clustering of the selected features. Finally, a supervised random forest (RF) classifier was used to identify gene profiles related to subtypes of OC from the original 29,096 probes. RESULTS: Of the 100 encoded features extracted by the autoencoder, seventy were significantly related to time to OC development based on the univariate Cox model, and these were used as inputs for clustering the patients. Two survival risk groups were identified (P value of log-rank test = 0.003) and were used as the labels for supervised classification. The RF classifier achieved an overall accuracy of 0.916 on the test set and yielded 21 top genes (FUT8-DDR2-ATM-CD247-ETS1-ZEB2-COL5A2-GMAP7-CDH1-COL11A2-COL3A1-AHR-COL2A1-CHORDC1-PTP4A3-COL1A2-CCR2-PDGFRB-COL1A1-FERMT2-PIK3CB) associated with time to developing OC, selected from the original 29,096 probes.
CONCLUSIONS: Using deep learning, our study identified prominent transcriptional biomarkers for determining which patients are at high risk of developing oral cancer; these biomarkers may be prognostic and may be significant targets for OC therapy. The identified genes may serve as potential targets for oral cancer chemoprevention. Additional validation of these biomarkers in experimental, prospective, and retrospective studies could support their use in OC clinics.
format Online
Article
Text
id pubmed-9972685
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9972685 2023-03-01 Identification of gene profiles related to the development of oral cancer using a deep learning technique Tapak, Leili Ghasemi, Mohammad Kazem Afshar, Saeid Mahjub, Hossein Soltanian, Alireza Khotanlou, Hassan BMC Med Genomics Research BioMed Central 2023-02-27 /pmc/articles/PMC9972685/ /pubmed/36849997 http://dx.doi.org/10.1186/s12920-023-01462-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Tapak, Leili
Ghasemi, Mohammad Kazem
Afshar, Saeid
Mahjub, Hossein
Soltanian, Alireza
Khotanlou, Hassan
Identification of gene profiles related to the development of oral cancer using a deep learning technique
title Identification of gene profiles related to the development of oral cancer using a deep learning technique
title_full Identification of gene profiles related to the development of oral cancer using a deep learning technique
title_fullStr Identification of gene profiles related to the development of oral cancer using a deep learning technique
title_full_unstemmed Identification of gene profiles related to the development of oral cancer using a deep learning technique
title_short Identification of gene profiles related to the development of oral cancer using a deep learning technique
title_sort identification of gene profiles related to the development of oral cancer using a deep learning technique
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972685/
https://www.ncbi.nlm.nih.gov/pubmed/36849997
http://dx.doi.org/10.1186/s12920-023-01462-6
work_keys_str_mv AT tapakleili identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
AT ghasemimohammadkazem identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
AT afsharsaeid identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
AT mahjubhossein identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
AT soltanianalireza identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
AT khotanlouhassan identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique
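
The description field reports that the two survival risk groups were compared with a log-rank test (P = 0.003). As a rough illustration of how such a test works, the following is a minimal sketch in Python using only the standard library. It is not the authors' code: the function name logrank_test and the toy follow-up times are hypothetical and were chosen only for illustration; they are not taken from GSE26549.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi2, p_value) on 1 degree of freedom.

    times_*  : follow-up time for each patient
    events_* : 1 if the event (here, OC development) was observed, 0 if censored
    """
    # Pool both groups, tagging each record with its group (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance, summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0  # no information to separate the groups

    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical toy data (months of follow-up), NOT from the study.
high_risk_times = [3, 5, 7, 9, 12]
high_risk_events = [1, 1, 1, 1, 1]
low_risk_times = [20, 25, 30, 35, 40]
low_risk_events = [0, 0, 1, 0, 0]

chi2, p_value = logrank_test(
    high_risk_times, high_risk_events, low_risk_times, low_risk_events
)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

In the study's workflow, the group labels would come from hierarchical clustering of the Cox-selected autoencoder features; here they are simply assumed. For real analyses, an established survival-analysis library is preferable to a hand-written test.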