PPIntegrator: semantic integrative system for protein–protein interaction and application for host–pathogen datasets

SUMMARY: Semantic web standards have shown importance in the last 20 years in promoting data formalization and interlinking between the existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years for the biological area, such as the br...

Bibliographic Details
Main authors: Martins, Yasmmin Côrtes; Ziviani, Artur; Cerqueira e Costa, Maiana de Oliveira; Cavalcanti, Maria Cláudia Reis; Nicolás, Marisa Fabiana; de Vasconcelos, Ana Tereza Ribeiro
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10290227/
https://www.ncbi.nlm.nih.gov/pubmed/37359724
http://dx.doi.org/10.1093/bioadv/vbad067
_version_ 1785062448707928064
author Martins, Yasmmin Côrtes
Ziviani, Artur
Cerqueira e Costa, Maiana de Oliveira
Cavalcanti, Maria Cláudia Reis
Nicolás, Marisa Fabiana
de Vasconcelos, Ana Tereza Ribeiro
collection PubMed
description SUMMARY: Semantic web standards have shown importance in the last 20 years in promoting data formalization and interlinking between the existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years for the biological area, such as the broadly used Gene Ontology that contains metadata to annotate gene function and subcellular location. Another important subject in the biological area is protein–protein interactions (PPIs) which have applications like protein function inference. Current PPI databases have heterogeneous exportation methods that challenge their integration and analysis. Presently, several initiatives of ontologies covering some concepts of the PPI domain are available to promote interoperability across datasets. However, the efforts to stimulate guidelines for automatic semantic data integration and analysis for PPIs in these datasets are limited. Here, we present PPIntegrator, a system that semantically describes data related to protein interactions. We also introduce an enrichment pipeline to generate, predict and validate new potential host–pathogen datasets by transitivity analysis. PPIntegrator contains a data preparation module to organize data from three reference databases and a triplification and data fusion module to describe the provenance information and results. This work provides an overview of the PPIntegrator system applied to integrate and compare host–pathogen PPI datasets from four bacterial species using our proposed transitivity analysis pipeline. We also demonstrated some critical queries to analyze this kind of data and highlight the importance and usage of the semantic data generated by our system. AVAILABILITY AND IMPLEMENTATION: https://github.com/YasCoMa/ppintegrator, https://github.com/YasCoMa/ppi_validation_process and https://github.com/YasCoMa/predprin.
format Online
Article
Text
id pubmed-10290227
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10290227 2023-06-25 Bioinform Adv Original Paper Oxford University Press 2023-06-01 /pmc/articles/PMC10290227/ /pubmed/37359724 http://dx.doi.org/10.1093/bioadv/vbad067 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PPIntegrator: semantic integrative system for protein–protein interaction and application for host–pathogen datasets
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10290227/
https://www.ncbi.nlm.nih.gov/pubmed/37359724
http://dx.doi.org/10.1093/bioadv/vbad067