PPIGCF: A Protein–Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection
Biological data at the omics level are highly complex and require powerful computational approaches to identify significant intrinsic characteristics and to search further for informative markers involved in the studied phenotype. In this paper, we propose a novel dimension reduction technique, protei...
Main Authors: | Pati, Soumen Kumar; Gupta, Manan Kumar; Banerjee, Ayan; Mallik, Saurav; Zhao, Zhongming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10218330/ https://www.ncbi.nlm.nih.gov/pubmed/37239423 http://dx.doi.org/10.3390/genes14051063 |
_version_ | 1785048748406079488 |
---|---|
author | Pati, Soumen Kumar Gupta, Manan Kumar Banerjee, Ayan Mallik, Saurav Zhao, Zhongming |
author_sort | Pati, Soumen Kumar |
collection | PubMed |
description | Biological data at the omics level are highly complex and require powerful computational approaches to identify significant intrinsic characteristics and to search further for informative markers involved in the studied phenotype. In this paper, we propose a novel dimension reduction technique, protein–protein interaction-based gene correlation filtration (PPIGCF), which builds on gene ontology (GO) and protein–protein interaction (PPI) structures to analyze microarray gene expression data. PPIGCF first extracts the gene symbols with their expression values from the experimental dataset and then classifies them based on GO biological process (BP) and cellular component (CC) annotations. Every classification group inherits all the information on its CCs, corresponding to the BPs, to establish a PPI network. Then, the gene correlation filter (based on gene rank and the proposed correlation coefficient) is computed on every network and removes a few weakly correlated genes from their corresponding networks. PPIGCF then computes the information content (IC) of the remaining genes in each PPI network and keeps only the genes with the highest IC values. The results of PPIGCF are used to prioritize significant genes. We compared PPIGCF with current methods to demonstrate its efficiency. The experiments show that PPIGCF needs fewer genes to reach reasonable accuracy (~99%) for cancer classification. The method reduces the computational complexity and improves the time complexity of biomarker discovery from such datasets. |
format | Online Article Text |
id | pubmed-10218330 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10218330 2023-05-27 Genes (Basel) Article MDPI 2023-05-10 /pmc/articles/PMC10218330/ /pubmed/37239423 http://dx.doi.org/10.3390/genes14051063 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | PPIGCF: A Protein–Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection |
topic | Article |
work_keys_str_mv | AT patisoumenkumar ppigcfaproteinproteininteractionbasedgenecorrelationfilterforoptimalgeneselection AT guptamanankumar ppigcfaproteinproteininteractionbasedgenecorrelationfilterforoptimalgeneselection AT banerjeeayan ppigcfaproteinproteininteractionbasedgenecorrelationfilterforoptimalgeneselection AT malliksaurav ppigcfaproteinproteininteractionbasedgenecorrelationfilterforoptimalgeneselection AT zhaozhongming ppigcfaproteinproteininteractionbasedgenecorrelationfilterforoptimalgeneselection |
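
The correlation-filter step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not define the proposed correlation coefficient, the gene-rank component, or the IC computation, so the sketch substitutes the plain Pearson correlation and covers only the removal of weakly correlated genes from one PPI network. The gene names (`G1` to `G4`), the expression values, the edge list, and the `threshold` parameter are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; a stand-in (assumption) for the paper's coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlation_filter(expression, edges, threshold=0.5):
    """Keep genes whose mean |r| with their network neighbours is >= threshold.

    expression: dict mapping gene symbol -> list of expression values
    edges: iterable of (gene_a, gene_b) pairs forming one PPI network
    Returns a dict mapping each kept gene to its mean absolute correlation.
    """
    neighbours = {gene: set() for gene in expression}
    for a, b in edges:
        if a in expression and b in expression and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    kept = {}
    for gene, nbrs in neighbours.items():
        if not nbrs:
            continue  # isolated genes have no network evidence in this sketch
        score = sum(abs(pearson(expression[gene], expression[n])) for n in nbrs) / len(nbrs)
        if score >= threshold:
            kept[gene] = score
    return kept


# Hypothetical toy data: four genes measured in four samples.
expression = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with G1
    "G3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with G1
    "G4": [1.0, -1.0, 1.0, -1.0],  # only weakly correlated with G1
}
network_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4")]

kept = correlation_filter(expression, network_edges, threshold=0.5)
print(sorted(kept))
```

With these toy values, `G4` is the only gene whose mean absolute correlation with its neighbours falls below the threshold, so it is the one removed from the network. Using the absolute value treats strong negative correlation as evidence of association, which is a design choice of this sketch rather than something stated in the record.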