PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms

Protein function prediction is gradually emerging as an essential field in biological and computational studies. Though the latter has clinched a significant footprint, it has been observed that the application of computational information gathered from multiple sources has more significant influenc...

Bibliographic Details
Main authors: Sengupta, Kaustav, Saha, Sovan, Halder, Anup Kumar, Chatterjee, Piyali, Nasipuri, Mita, Basu, Subhadip, Plewczynski, Dariusz
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9556876/
https://www.ncbi.nlm.nih.gov/pubmed/36246645
http://dx.doi.org/10.3389/fgene.2022.969915
author Sengupta, Kaustav
Saha, Sovan
Halder, Anup Kumar
Chatterjee, Piyali
Nasipuri, Mita
Basu, Subhadip
Plewczynski, Dariusz
collection PubMed
description Protein function prediction is emerging as an essential field in biological and computational studies. Although computational approaches have gained a significant footprint, it has been observed that combining information from multiple sources has more influence than relying on information derived from a single source. Considering this, a methodology, PFP-GO, is proposed in which heterogeneous sources (protein sequence, protein domain, and protein-protein interaction network) are processed separately to rank each individual functional GO term. Based on this ranking, GO terms are propagated to the target proteins. While protein sequence contributes sequence-based information, protein domains and protein-protein interaction networks embed structural/functional and topological information, respectively, during the GO ranking phase. The performance of PFP-GO is evaluated using Precision, Recall, and F-Score, and it was found to perform reasonably better than existing state-of-the-art methods. PFP-GO achieved an overall Precision, Recall, and F-Score of 0.67, 0.58, and 0.62, respectively. Furthermore, we check some of the top-ranked GO terms predicted by PFP-GO through multilayer network propagation that affect the 3D structure of the genome. The complete source code of PFP-GO is freely available at https://sites.google.com/view/pfp-go/.
format Online
Article
Text
id pubmed-9556876
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9556876 2022-10-14 PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms. Sengupta, Kaustav; Saha, Sovan; Halder, Anup Kumar; Chatterjee, Piyali; Nasipuri, Mita; Basu, Subhadip; Plewczynski, Dariusz. Front Genet. Genetics. Frontiers Media S.A. 2022-09-29 /pmc/articles/PMC9556876/ /pubmed/36246645 http://dx.doi.org/10.3389/fgene.2022.969915 Text en Copyright © 2022 Sengupta, Saha, Halder, Chatterjee, Nasipuri, Basu and Plewczynski.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9556876/
https://www.ncbi.nlm.nih.gov/pubmed/36246645
http://dx.doi.org/10.3389/fgene.2022.969915
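The description field reports an overall Precision, Recall, and F-Score of 0.67, 0.58, and 0.62. The record does not say how the F-Score was computed, so the sketch below assumes the standard F1 definition (the harmonic mean of precision and recall) and checks that the three reported values are consistent under that assumption. The function name `f_score` and its `beta` parameter are illustrative and are not taken from the PFP-GO source code; because the inputs are already rounded to two decimals, the check is only approximate.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    With beta == 1 this is the standard F1 score. Assumption: the
    abstract's "F-Score" refers to F1; the record does not confirm this.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Values as reported in the description field of this record.
reported_precision = 0.67
reported_recall = 0.58
reported_f_score = 0.62

computed = f_score(reported_precision, reported_recall)
print(f"computed F1 = {computed:.2f}, reported F-Score = {reported_f_score:.2f}")
```

Under the F1 assumption, 2 × 0.67 × 0.58 / (0.67 + 0.58) = 0.7772 / 1.25 ≈ 0.62, which matches the reported value to two decimal places.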