
Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway


Bibliographic Details
Main authors: Yang, Lili, Zhang, Yu-Hang, Huang, FeiMing, Li, ZhanDong, Huang, Tao, Cai, Yu-Dong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9511048/
https://www.ncbi.nlm.nih.gov/pubmed/36171880
http://dx.doi.org/10.3389/fgene.2022.1011659
_version_ 1784797576480948224
author Yang, Lili
Zhang, Yu-Hang
Huang, FeiMing
Li, ZhanDong
Huang, Tao
Cai, Yu-Dong
author_facet Yang, Lili
Zhang, Yu-Hang
Huang, FeiMing
Li, ZhanDong
Huang, Tao
Cai, Yu-Dong
author_sort Yang, Lili
collection PubMed
description Protein–protein interactions (PPIs) are extremely important for gaining mechanistic insights into the functional organization of the proteome. The resolution of PPI functions can help in the identification of novel diagnostic and therapeutic targets with medical utility, thus facilitating the development of new medications. However, the traditional methods for resolving PPI functions are mainly experimental methods, such as co-immunoprecipitation, pull-down assays, cross-linking, label transfer, and far-Western blot analysis, that are not only expensive but also time-consuming. In this study, we constructed an integrated feature selection scheme for the large-scale selection of the relevant functions of PPIs by using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations of PPI participants. First, we encoded the proteins in each PPI with their gene ontologies and KEGG pathways. Then, the encoded protein features were refined as features of both positive and negative PPIs. Subsequently, Boruta was used for the initial filtering of features to obtain 5684 features. Three feature ranking algorithms, namely, least absolute shrinkage and selection operator, light gradient boosting machine, and max-relevance and min-redundancy, were applied to evaluate feature importance. Finally, the top-ranked features derived from multiple datasets were comprehensively evaluated, and the intersection of results mined by three feature ranking algorithms was taken to identify the features with high correlation with PPIs. Some functional terms were identified in our study, including cytokine–cytokine receptor interaction (hsa04060), intrinsic component of membrane (GO:0031224), and protein-binding biological process (GO:0005515). Our newly proposed integrated computational approach offers a novel perspective of the large-scale mining of biological functions linked to PPI.
format Online
Article
Text
id pubmed-9511048
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-95110482022-09-27 Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway Yang, Lili Zhang, Yu-Hang Huang, FeiMing Li, ZhanDong Huang, Tao Cai, Yu-Dong Front Genet Genetics Protein–protein interactions (PPIs) are extremely important for gaining mechanistic insights into the functional organization of the proteome. The resolution of PPI functions can help in the identification of novel diagnostic and therapeutic targets with medical utility, thus facilitating the development of new medications. However, the traditional methods for resolving PPI functions are mainly experimental methods, such as co-immunoprecipitation, pull-down assays, cross-linking, label transfer, and far-Western blot analysis, that are not only expensive but also time-consuming. In this study, we constructed an integrated feature selection scheme for the large-scale selection of the relevant functions of PPIs by using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations of PPI participants. First, we encoded the proteins in each PPI with their gene ontologies and KEGG pathways. Then, the encoded protein features were refined as features of both positive and negative PPIs. Subsequently, Boruta was used for the initial filtering of features to obtain 5684 features. Three feature ranking algorithms, namely, least absolute shrinkage and selection operator, light gradient boosting machine, and max-relevance and min-redundancy, were applied to evaluate feature importance. Finally, the top-ranked features derived from multiple datasets were comprehensively evaluated, and the intersection of results mined by three feature ranking algorithms was taken to identify the features with high correlation with PPIs. 
Some functional terms were identified in our study, including cytokine–cytokine receptor interaction (hsa04060), intrinsic component of membrane (GO:0031224), and protein-binding biological process (GO:0005515). Our newly proposed integrated computational approach offers a novel perspective of the large-scale mining of biological functions linked to PPI. Frontiers Media S.A. 2022-09-12 /pmc/articles/PMC9511048/ /pubmed/36171880 http://dx.doi.org/10.3389/fgene.2022.1011659 Text en Copyright © 2022 Yang, Zhang, Huang, Li, Huang and Cai. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Yang, Lili
Zhang, Yu-Hang
Huang, FeiMing
Li, ZhanDong
Huang, Tao
Cai, Yu-Dong
Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title_full Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title_fullStr Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title_full_unstemmed Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title_short Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway
title_sort identification of protein–protein interaction associated functions based on gene ontology and kegg pathway
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9511048/
https://www.ncbi.nlm.nih.gov/pubmed/36171880
http://dx.doi.org/10.3389/fgene.2022.1011659
work_keys_str_mv AT yanglili identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway
AT zhangyuhang identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway
AT huangfeiming identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway
AT lizhandong identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway
AT huangtao identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway
AT caiyudong identificationofproteinproteininteractionassociatedfunctionsbasedongeneontologyandkeggpathway