
Identifying Breast Cancer-Related Genes Based on a Novel Computational Framework Involving KEGG Pathways and PPI Network Modularity

Complex diseases, such as breast cancer, are often caused by mutations of multiple functional genes. Identifying disease-related genes is a critical and challenging task for unveiling the biological mechanisms behind these diseases. In this study, we develop a novel computational framework to analyz...


Bibliographic Details
Main authors: Zhang, Yan, Xiang, Ju, Tang, Liang, Li, Jianming, Lu, Qingqing, Tian, Geng, He, Bin-Sheng, Yang, Jialiang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415302/
https://www.ncbi.nlm.nih.gov/pubmed/34484285
http://dx.doi.org/10.3389/fgene.2021.596794
_version_ 1783747940526850048
author Zhang, Yan
Xiang, Ju
Tang, Liang
Li, Jianming
Lu, Qingqing
Tian, Geng
He, Bin-Sheng
Yang, Jialiang
author_sort Zhang, Yan
collection PubMed
description Complex diseases, such as breast cancer, are often caused by mutations of multiple functional genes. Identifying disease-related genes is a critical and challenging task for unveiling the biological mechanisms behind these diseases. In this study, we develop a novel computational framework to analyze the network properties of the known breast cancer–associated genes, based on which we develop a random-walk-with-restart (RCRWR) algorithm to predict novel disease genes. Specifically, we first curated a set of breast cancer–associated genes from the Genome-Wide Association Studies catalog and the Online Mendelian Inheritance in Man database and then studied the distribution of these genes on an integrated protein–protein interaction (PPI) network. We found that the breast cancer–associated genes are significantly closer to each other than expected by chance, which confirms the modularity property of disease genes in a PPI network as revealed by previous studies. We then retrieved PPI subnetworks spanning top breast cancer–associated KEGG pathways and found that the distribution of these genes on the subnetworks is non-random, suggesting that these KEGG pathways are activated non-uniformly. Taking advantage of the non-random distribution of breast cancer–associated genes, we developed an improved RCRWR algorithm to predict novel cancer genes, which integrates network reconstruction based on local random walk dynamics and subnetworks spanning KEGG pathways. Compared with disease gene prediction that does not use the information from the KEGG pathways, this method performs better at inferring breast cancer–associated genes, and the top predicted genes are better enriched in known breast cancer–associated gene ontologies. Finally, we performed a literature search on top predicted novel genes and found that most of them are supported at least by wet-lab experiments on cell lines. In summary, we propose a robust computational framework to prioritize novel breast cancer–associated genes, which could be used for further in vitro and in vivo experimental validation.
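The description above builds on random walk with restart over a PPI network. As a rough illustration only, the following is a minimal sketch of the standard, textbook random walk with restart, not the authors' RCRWR method (it includes neither the network reconstruction nor the KEGG subnetworks). The function name, the restart probability of 0.5, and the five-node toy graph are all assumptions made for this example.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Standard RWR on an undirected graph (hypothetical helper, not RCRWR).

    adj     : square adjacency matrix (toy stand-in for a PPI network)
    seeds   : indices of known disease genes
    restart : probability of jumping back to the seed set at each step
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                      # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)      # uniform mass on the seed genes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 convergence check
            return p_next
        p = p_next
    return p

# Toy network (assumed): nodes 0 and 1 are seeds, node 2 touches both,
# node 3 hangs off node 2, node 4 hangs off node 3.
adj = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
seeds = [0, 1]
scores = random_walk_with_restart(adj, seeds)

# Rank non-seed nodes by steady-state score, highest first.
ranking = [int(i) for i in np.argsort(-scores) if int(i) not in seeds]
print(ranking)
```

In this toy case, candidates nearer to the seed genes receive higher scores, which is the intuition behind prioritizing genes that sit close to known disease genes in the network.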
format Online
Article
Text
id pubmed-8415302
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8415302 2021-09-04 Identifying Breast Cancer-Related Genes Based on a Novel Computational Framework Involving KEGG Pathways and PPI Network Modularity Zhang, Yan Xiang, Ju Tang, Liang Li, Jianming Lu, Qingqing Tian, Geng He, Bin-Sheng Yang, Jialiang Front Genet Genetics Complex diseases, such as breast cancer, are often caused by mutations of multiple functional genes. Identifying disease-related genes is a critical and challenging task for unveiling the biological mechanisms behind these diseases. In this study, we develop a novel computational framework to analyze the network properties of the known breast cancer–associated genes, based on which we develop a random-walk-with-restart (RCRWR) algorithm to predict novel disease genes. Specifically, we first curated a set of breast cancer–associated genes from the Genome-Wide Association Studies catalog and the Online Mendelian Inheritance in Man database and then studied the distribution of these genes on an integrated protein–protein interaction (PPI) network. We found that the breast cancer–associated genes are significantly closer to each other than expected by chance, which confirms the modularity property of disease genes in a PPI network as revealed by previous studies. We then retrieved PPI subnetworks spanning top breast cancer–associated KEGG pathways and found that the distribution of these genes on the subnetworks is non-random, suggesting that these KEGG pathways are activated non-uniformly. Taking advantage of the non-random distribution of breast cancer–associated genes, we developed an improved RCRWR algorithm to predict novel cancer genes, which integrates network reconstruction based on local random walk dynamics and subnetworks spanning KEGG pathways. Compared with disease gene prediction that does not use the information from the KEGG pathways, this method performs better at inferring breast cancer–associated genes, and the top predicted genes are better enriched in known breast cancer–associated gene ontologies. Finally, we performed a literature search on top predicted novel genes and found that most of them are supported at least by wet-lab experiments on cell lines. In summary, we propose a robust computational framework to prioritize novel breast cancer–associated genes, which could be used for further in vitro and in vivo experimental validation. Frontiers Media S.A. 2021-08-16 /pmc/articles/PMC8415302/ /pubmed/34484285 http://dx.doi.org/10.3389/fgene.2021.596794 Text en Copyright © 2021 Zhang, Xiang, Tang, Li, Lu, Tian, He and Yang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying Breast Cancer-Related Genes Based on a Novel Computational Framework Involving KEGG Pathways and PPI Network Modularity
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415302/
https://www.ncbi.nlm.nih.gov/pubmed/34484285
http://dx.doi.org/10.3389/fgene.2021.596794