PPIGCF: A Protein–Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection

Biological data at the omics level are highly complex, requiring powerful computational approaches to identify significant intrinsic characteristics and to search further for informative markers involved in the studied phenotype. In this paper, we propose a novel dimension reduction technique, protei...

Bibliographic Details
Main Authors: Pati, Soumen Kumar, Gupta, Manan Kumar, Banerjee, Ayan, Mallik, Saurav, Zhao, Zhongming
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10218330/
https://www.ncbi.nlm.nih.gov/pubmed/37239423
http://dx.doi.org/10.3390/genes14051063
author Pati, Soumen Kumar
Gupta, Manan Kumar
Banerjee, Ayan
Mallik, Saurav
Zhao, Zhongming
collection PubMed
description Biological data at the omics level are highly complex, requiring powerful computational approaches to identify significant intrinsic characteristics and to search further for informative markers involved in the studied phenotype. In this paper, we propose a novel dimension reduction technique, protein–protein interaction-based gene correlation filtration (PPIGCF), which builds on gene ontology (GO) and protein–protein interaction (PPI) structures to analyze microarray gene expression data. PPIGCF first extracts the gene symbols with their expression values from the experimental dataset and then classifies them based on GO biological process (BP) and cellular component (CC) annotations. Every classification group inherits all the information on its CCs, corresponding to the BPs, to establish a PPI network. Then, the gene correlation filter (based on gene rank and the proposed correlation coefficient) is computed on every network and removes a few weakly correlated genes from their corresponding networks. PPIGCF then finds the information content (IC) of the remaining genes in the PPI network and retains only the genes with the highest IC values. The results of PPIGCF are used to prioritize significant genes. We performed a comparison with current methods to demonstrate our technique’s efficiency. From the experiment, it can be concluded that PPIGCF needs fewer genes to reach reasonable accuracy (~99%) for cancer classification. This paper reduces the computational complexity and improves the time complexity of biomarker discovery from datasets.
format Online
Article
Text
id pubmed-10218330
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10218330 2023-05-27 PPIGCF: A Protein–Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection Pati, Soumen Kumar Gupta, Manan Kumar Banerjee, Ayan Mallik, Saurav Zhao, Zhongming Genes (Basel) Article MDPI 2023-05-10 /pmc/articles/PMC10218330/ /pubmed/37239423 http://dx.doi.org/10.3390/genes14051063 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title PPIGCF: A Protein–Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection
topic Article
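
The filtering steps outlined in the description field can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Pearson correlation stands in for the proposed correlation coefficient, a gene's information content is taken as the negative log2 frequency of its rarest annotation, a single PPI network is used instead of one network per GO group, and all function names, the threshold, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_filter(expr, edges, threshold):
    """Keep genes whose mean |r| with their PPI neighbours is >= threshold.

    Assumption: Pearson r replaces the paper's own correlation coefficient.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a in expr and b in expr:
            neighbours[a].add(b)
            neighbours[b].add(a)
    kept = {}
    for gene, nbrs in neighbours.items():
        score = sum(abs(pearson(expr[gene], expr[n])) for n in nbrs) / len(nbrs)
        if score >= threshold:
            kept[gene] = score
    return kept


def information_content(annotations):
    """Assumed IC: -log2(share of genes carrying a term); a gene takes its max over terms."""
    total = len(annotations)
    counts = defaultdict(int)
    for terms in annotations.values():
        for term in set(terms):
            counts[term] += 1
    return {
        gene: max((-math.log2(counts[t] / total) for t in set(terms)), default=0.0)
        for gene, terms in annotations.items()
    }


def select_genes(expr, edges, annotations, threshold=0.5, top_k=2):
    """Apply the correlation filter, then rank survivors by IC (ties: score, then name)."""
    kept = correlation_filter(expr, edges, threshold)
    ic = information_content(annotations)
    ranked = sorted(kept, key=lambda g: (-ic.get(g, 0.0), -kept[g], g))
    return ranked[:top_k]


# Hypothetical toy data: four genes, four samples, three PPI edges.
expr = {
    "G1": [1, 2, 3, 4],
    "G2": [2, 4, 6, 8],
    "G3": [4, 3, 2, 1],
    "G4": [1, 3, 1, 3],
}
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4")]
annotations = {
    "G1": ["BP:a"],
    "G2": ["BP:a", "CC:x"],
    "G3": ["BP:b"],
    "G4": ["BP:a"],
}

selected = select_genes(expr, edges, annotations)
print(selected)
```

In this toy run, G4 is only weakly correlated with its single neighbour and is dropped by the filter; the remaining genes are then ranked by the assumed IC measure. Real use would require actual GO annotations and a PPI source, neither of which is specified in this record.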