Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach
SIMPLE SUMMARY: Protein–protein interactions (PPIs) are the basis for understanding cellular events in biological systems. Experimental biochemical, molecular, and genetic methods have been used to identify protein–protein associations. However, they are time-consuming and expensive. Machine learnin...
Main authors: | Ortiz-Vilchis, Pilar; De-la-Cruz-García, Jazmin-Susana; Ramirez-Arellano, Aldo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9856098/ https://www.ncbi.nlm.nih.gov/pubmed/36671832 http://dx.doi.org/10.3390/biology12010140 |
_version_ | 1784873539844702208 |
---|---|
author | Ortiz-Vilchis, Pilar De-la-Cruz-García, Jazmin-Susana Ramirez-Arellano, Aldo |
author_facet | Ortiz-Vilchis, Pilar De-la-Cruz-García, Jazmin-Susana Ramirez-Arellano, Aldo |
author_sort | Ortiz-Vilchis, Pilar |
collection | PubMed |
description | SIMPLE SUMMARY: Protein–protein interactions (PPIs) are the basis for understanding cellular events in biological systems. Experimental biochemical, molecular, and genetic methods have been used to identify protein–protein associations. However, they are time-consuming and expensive. Machine learning techniques have been used to characterize PPIs, optimizing time and resources. This study aimed to generate a relevant protein sequence with partial knowledge of interactions by conducting a scale-free and fractal analysis. The outcome of these analyses is then used to fine-tune the fractal method for extracting vital proteins from PPI networks. The results show that several PPI networks are self-similar or fractal, but not both. The protein sequences generated by the deep learning network contain a considerable number of the proteins in the original sequence. Moreover, most of the PPIs in the generated sequences appear in the original set. This information can help researchers guide experimental design and find key points for new therapeutics. ABSTRACT: Protein–protein interactions (PPIs) are the basis for understanding most cellular events in biological systems. Several experimental methods, e.g., biochemical, molecular, and genetic methods, have been used to identify protein–protein associations. However, some of them, such as mass spectrometry, are time-consuming and expensive. Machine learning (ML) techniques have been widely used to characterize PPIs, increasing the number of proteins analyzed simultaneously and optimizing time and resources for identifying and predicting protein–protein functional linkages. Previous ML approaches have focused on well-known networks or specific targets but not on identifying relevant proteins with partial or null knowledge of the interaction networks. The proposed approach aims to generate a relevant protein sequence with partial knowledge of interactions, using a bidirectional long short-term memory (LSTM) network.
The general framework comprises a scale-free and fractal complex network analysis. The outcome of these analyses is then used to fine-tune the fractal method for extracting vital proteins from PPI networks. The results show that several PPI networks are self-similar or fractal, but that both features cannot coexist. The protein sequences generated by the bidirectional LSTM contain, on average, 39.5% of the proteins in the original sequence. The average length of the generated sequences was 17% of the original one. Finally, 95% of the generated sequences were true. |
format | Online Article Text |
id | pubmed-9856098 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9856098 2023-01-21 Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach Ortiz-Vilchis, Pilar De-la-Cruz-García, Jazmin-Susana Ramirez-Arellano, Aldo Biology (Basel) Article MDPI 2023-01-16 /pmc/articles/PMC9856098/ /pubmed/36671832 http://dx.doi.org/10.3390/biology12010140 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Ortiz-Vilchis, Pilar De-la-Cruz-García, Jazmin-Susana Ramirez-Arellano, Aldo Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach |
title | Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9856098/ https://www.ncbi.nlm.nih.gov/pubmed/36671832 http://dx.doi.org/10.3390/biology12010140 |