Cargando…

Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning

Hepatocellular carcinoma (HCC) is the most common lethal malignancy of the liver worldwide. Thus, it is important to dig the key genes for uncovering the molecular mechanisms and to improve diagnostic and therapeutic options for HCC. This study aimed to encompass a set of statistical and machine lea...

Descripción completa

Detalles Bibliográficos
Autores principales: Hasan, Md. Al Mehedi, Maniruzzaman, Md., Shin, Jungpil
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9992474/
https://www.ncbi.nlm.nih.gov/pubmed/36882493
http://dx.doi.org/10.1038/s41598-023-30851-1
_version_ 1784902317647069184
author Hasan, Md. Al Mehedi
Maniruzzaman, Md.
Shin, Jungpil
author_facet Hasan, Md. Al Mehedi
Maniruzzaman, Md.
Shin, Jungpil
author_sort Hasan, Md. Al Mehedi
collection PubMed
description Hepatocellular carcinoma (HCC) is the most common lethal malignancy of the liver worldwide. Thus, it is important to dig the key genes for uncovering the molecular mechanisms and to improve diagnostic and therapeutic options for HCC. This study aimed to encompass a set of statistical and machine learning computational approaches for identifying the key candidate genes for HCC. Three microarray datasets were used in this work, which were downloaded from the Gene Expression Omnibus Database. At first, normalization and differentially expressed genes (DEGs) identification were performed using limma for each dataset. Then, support vector machine (SVM) was implemented to determine the differentially expressed discriminative genes (DEDGs) from DEGs of each dataset and select overlapping DEDGs genes among identified three sets of DEDGs. Enrichment analysis was performed on common DEDGs using DAVID. A protein-protein interaction (PPI) network was constructed using STRING and the central hub genes were identified depending on the degree, maximum neighborhood component (MNC), maximal clique centrality (MCC), centralities of closeness, and betweenness criteria using CytoHubba. Simultaneously, significant modules were selected using MCODE scores and identified their associated genes from the PPI networks. Moreover, metadata were created by listing all hub genes from previous studies and identified significant meta-hub genes whose occurrence frequency was greater than 3 among previous studies. Finally, six key candidate genes (TOP2A, CDC20, ASPM, PRC1, NUSAP1, and UBE2C) were determined by intersecting shared genes among central hub genes, hub module genes, and significant meta-hub genes. Two independent test datasets (GSE76427 and TCGA-LIHC) were utilized to validate these key candidate genes using the area under the curve. Moreover, the prognostic potential of these six key candidate genes was also evaluated on the TCGA-LIHC cohort using survival analysis.
format Online
Article
Text
id pubmed-9992474
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-99924742023-03-09 Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning Hasan, Md. Al Mehedi Maniruzzaman, Md. Shin, Jungpil Sci Rep Article Hepatocellular carcinoma (HCC) is the most common lethal malignancy of the liver worldwide. Thus, it is important to dig the key genes for uncovering the molecular mechanisms and to improve diagnostic and therapeutic options for HCC. This study aimed to encompass a set of statistical and machine learning computational approaches for identifying the key candidate genes for HCC. Three microarray datasets were used in this work, which were downloaded from the Gene Expression Omnibus Database. At first, normalization and differentially expressed genes (DEGs) identification were performed using limma for each dataset. Then, support vector machine (SVM) was implemented to determine the differentially expressed discriminative genes (DEDGs) from DEGs of each dataset and select overlapping DEDGs genes among identified three sets of DEDGs. Enrichment analysis was performed on common DEDGs using DAVID. A protein-protein interaction (PPI) network was constructed using STRING and the central hub genes were identified depending on the degree, maximum neighborhood component (MNC), maximal clique centrality (MCC), centralities of closeness, and betweenness criteria using CytoHubba. Simultaneously, significant modules were selected using MCODE scores and identified their associated genes from the PPI networks. Moreover, metadata were created by listing all hub genes from previous studies and identified significant meta-hub genes whose occurrence frequency was greater than 3 among previous studies. Finally, six key candidate genes (TOP2A, CDC20, ASPM, PRC1, NUSAP1, and UBE2C) were determined by intersecting shared genes among central hub genes, hub module genes, and significant meta-hub genes. Two independent test datasets (GSE76427 and TCGA-LIHC) were utilized to validate these key candidate genes using the area under the curve. Moreover, the prognostic potential of these six key candidate genes was also evaluated on the TCGA-LIHC cohort using survival analysis. Nature Publishing Group UK 2023-03-07 /pmc/articles/PMC9992474/ /pubmed/36882493 http://dx.doi.org/10.1038/s41598-023-30851-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Article
Hasan, Md. Al Mehedi
Maniruzzaman, Md.
Shin, Jungpil
Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title_full Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title_fullStr Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title_full_unstemmed Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title_short Differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
title_sort differentially expressed discriminative genes and significant meta-hub genes based key genes identification for hepatocellular carcinoma using statistical machine learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9992474/
https://www.ncbi.nlm.nih.gov/pubmed/36882493
http://dx.doi.org/10.1038/s41598-023-30851-1
work_keys_str_mv AT hasanmdalmehedi differentiallyexpresseddiscriminativegenesandsignificantmetahubgenesbasedkeygenesidentificationforhepatocellularcarcinomausingstatisticalmachinelearning
AT maniruzzamanmd differentiallyexpresseddiscriminativegenesandsignificantmetahubgenesbasedkeygenesidentificationforhepatocellularcarcinomausingstatisticalmachinelearning
AT shinjungpil differentiallyexpresseddiscriminativegenesandsignificantmetahubgenesbasedkeygenesidentificationforhepatocellularcarcinomausingstatisticalmachinelearning