Identification of gene profiles related to the development of oral cancer using a deep learning technique
BACKGROUND: Oral cancer (OC) is a debilitating disease that can adversely affect patients' quality of life. Patients with oral premalignant lesions have a high risk of developing OC. Therefore, identifying robust survival subgroups among them may significantly improve patient therapy and care....
Main authors: | Tapak, Leili; Ghasemi, Mohammad Kazem; Afshar, Saeid; Mahjub, Hossein; Soltanian, Alireza; Khotanlou, Hassan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972685/ https://www.ncbi.nlm.nih.gov/pubmed/36849997 http://dx.doi.org/10.1186/s12920-023-01462-6 |
_version_ | 1784898370567929856 |
---|---|
author | Tapak, Leili Ghasemi, Mohammad Kazem Afshar, Saeid Mahjub, Hossein Soltanian, Alireza Khotanlou, Hassan |
author_facet | Tapak, Leili Ghasemi, Mohammad Kazem Afshar, Saeid Mahjub, Hossein Soltanian, Alireza Khotanlou, Hassan |
author_sort | Tapak, Leili |
collection | PubMed |
description | BACKGROUND: Oral cancer (OC) is a debilitating disease that can adversely affect patients' quality of life. Patients with oral premalignant lesions have a high risk of developing OC. Therefore, identifying robust survival subgroups among them may significantly improve patient therapy and care. This study aimed to identify prognostic biomarkers that predict the time to development of OC and to stratify patients by survival using state-of-the-art machine learning and deep learning. METHODS: Gene expression profiles (29,096 probes) of 86 patients from the GSE26549 dataset in the GEO repository were used. An autoencoder deep learning neural network was used to extract features. A univariate Cox regression model was then used to select the significant features among those obtained from the deep learning method (P < 0.05). High-risk and low-risk groups were identified by hierarchical clustering of the encoded features selected by the Cox proportional hazards model (100 encoded features were extracted, equal to the number of units in the encoding layer, i.e., the bottleneck of the network). A supervised random forest (RF) classifier was then used to identify gene profiles related to subtypes of OC from the original 29,096 probes. RESULTS: Among the 100 encoded features extracted by the autoencoder, seventy were significantly related to time to OC development based on the univariate Cox model, and these were used as inputs for clustering the patients. Two survival risk groups were identified (log-rank test P = 0.003) and were used as labels for supervised classification. The overall accuracy of the RF classifier was 0.916 on the test set, and it yielded 21 top genes (FUT8-DDR2-ATM-CD247-ETS1-ZEB2-COL5A2-GMAP7-CDH1-COL11A2-COL3A1-AHR-COL2A1-CHORDC1-PTP4A3-COL1A2-CCR2-PDGFRB-COL1A1-FERMT2-PIK3CB) associated with time to developing OC, selected among the original 29,096 probes. CONCLUSIONS: Using deep learning, our study identified prominent transcriptional biomarkers for determining patients at high risk of developing oral cancer, which may be prognostic and may serve as significant targets for OC therapy. The identified genes may serve as potential targets for oral cancer chemoprevention. Further validation of these biomarkers in experimental, prospective, and retrospective studies would support their introduction into OC clinics. |
format | Online Article Text |
id | pubmed-9972685 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-99726852023-03-01 Identification of gene profiles related to the development of oral cancer using a deep learning technique Tapak, Leili Ghasemi, Mohammad Kazem Afshar, Saeid Mahjub, Hossein Soltanian, Alireza Khotanlou, Hassan BMC Med Genomics Research BioMed Central 2023-02-27 /pmc/articles/PMC9972685/ /pubmed/36849997 http://dx.doi.org/10.1186/s12920-023-01462-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Tapak, Leili Ghasemi, Mohammad Kazem Afshar, Saeid Mahjub, Hossein Soltanian, Alireza Khotanlou, Hassan Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title | Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title_full | Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title_fullStr | Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title_full_unstemmed | Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title_short | Identification of gene profiles related to the development of oral cancer using a deep learning technique |
title_sort | identification of gene profiles related to the development of oral cancer using a deep learning technique |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972685/ https://www.ncbi.nlm.nih.gov/pubmed/36849997 http://dx.doi.org/10.1186/s12920-023-01462-6 |
work_keys_str_mv | AT tapakleili identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique AT ghasemimohammadkazem identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique AT afsharsaeid identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique AT mahjubhossein identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique AT soltanianalireza identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique AT khotanlouhassan identificationofgeneprofilesrelatedtothedevelopmentoforalcancerusingadeeplearningtechnique |
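
The description field above reports that the two survival risk groups were separated by a log-rank test (P = 0.003). As a rough illustration of what that test computes, and not the authors' code, the following Python sketch implements a two-group log-rank test using only the standard library. The function name `logrank_test` and the toy follow-up times are hypothetical and chosen for illustration; they are not taken from GSE26549.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  : follow-up times for each patient in the group
    events_* : 1 if the event (here, OC development) was observed, 0 if censored
    """
    # Pool all observations as (time, event_flag, group_index).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only distinct times at which at least one event occurred contribute.
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n  # expected events in group A under H0
        if n > 1:  # hypergeometric variance is undefined for a single patient
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data (made up): months to OC development, 0 = censored.
    high_risk_times = [3, 5, 8, 12, 14, 20]
    high_risk_events = [1, 1, 1, 1, 0, 1]
    low_risk_times = [15, 22, 30, 36, 40, 48]
    low_risk_events = [1, 0, 1, 0, 0, 0]

    chi2, p = logrank_test(high_risk_times, high_risk_events,
                           low_risk_times, low_risk_events)
    print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

In the workflow summarized in the record, a test of this kind would be applied to the cluster labels produced by hierarchical clustering; a small p-value suggests the two clusters differ in time to OC development. For real analyses, an established survival-analysis library is preferable to this sketch.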