Identification of all-against-all protein–protein interactions based on deep hash learning

Bibliographic Details
Main Authors:	Jiang, Yue, Wang, Yuxuan, Shen, Lin, Adjeroh, Donald A., Liu, Zhidong, Lin, Jie
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9264577/ https://www.ncbi.nlm.nih.gov/pubmed/35804303 http://dx.doi.org/10.1186/s12859-022-04811-x

author	Jiang, Yue Wang, Yuxuan Shen, Lin Adjeroh, Donald A. Liu, Zhidong Lin, Jie
collection	PubMed
description	BACKGROUND: Protein–protein interaction (PPI) is vital for life processes, disease treatment, and drug discovery. The computational prediction of PPI is relatively inexpensive and efficient when compared to traditional wet-lab experiments. Given a new protein, one may wish to find whether the protein has any PPI relationship with other existing proteins. Current computational PPI prediction methods usually compare the new protein to existing proteins one by one in a pairwise manner, which is time-consuming. RESULTS: In this work, we propose a more efficient model, called deep hash learning protein-and-protein interaction (DHL-PPI), to predict all-against-all PPI relationships in a database of proteins. First, DHL-PPI encodes a protein sequence into a binary hash code based on deep features extracted from the protein sequences using deep learning techniques. This encoding scheme enables us to turn the PPI discrimination problem into a much simpler searching problem. The binary hash code for a protein sequence can be regarded as a number. Thus, in the pre-screening stage of DHL-PPI, the string matching problem of comparing a protein sequence against a database with M proteins can be transformed into a much simpler problem: finding a number in a sorted array of length M. This pre-screening process narrows down the search to a much smaller set of candidate proteins for further confirmation. As a final step, DHL-PPI uses the Hamming distance to verify the final PPI relationship. CONCLUSIONS: The experimental results confirmed that DHL-PPI is feasible and effective. Using a dataset with strictly negative PPI examples of four species, DHL-PPI is shown to be superior to or competitive with other state-of-the-art methods in terms of precision, recall or F1 score. Furthermore, in the prediction stage, the proposed DHL-PPI reduced the time complexity from $O(M^2)$ to $O(M \log M)$ for performing an all-against-all PPI prediction for a database with M proteins. With the proposed approach, a protein database can be preprocessed and stored for later search using the proposed encoding scheme. This can provide a more efficient way to cope with the rapidly increasing volume of protein datasets.
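The search idea described in the abstract (treat each binary hash code as a number, pre-screen with a lookup in a sorted array, then verify with the Hamming distance) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The code length, the use of a shared bit prefix for pre-screening, the Hamming threshold, and the example codes are all hypothetical choices made for illustration, and the deep network that would produce the codes is not shown.

```python
from bisect import bisect_left

CODE_BITS = 16    # hypothetical hash-code length
PREFIX_BITS = 8   # hypothetical: leading bits compared during pre-screening
MAX_HAMMING = 2   # hypothetical verification threshold


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def build_index(codes: dict[str, int]) -> list[tuple[int, str]]:
    """Sort (code, protein_id) pairs once so that lookups can use binary search."""
    return sorted((code, pid) for pid, code in codes.items())


def query(index: list[tuple[int, str]], code: int) -> list[str]:
    """Pre-screen by binary search on the code prefix, then verify by Hamming distance."""
    shift = CODE_BITS - PREFIX_BITS
    low = (code >> shift) << shift       # smallest code sharing the prefix
    high = low | ((1 << shift) - 1)      # largest code sharing the prefix
    # Codes sharing a prefix are contiguous in the sorted array.
    start = bisect_left(index, (low, ""))
    stop = bisect_left(index, (high + 1, ""))
    candidates = index[start:stop]
    return [pid for c, pid in candidates if hamming(code, c) <= MAX_HAMMING]


# Made-up codes for four hypothetical proteins.
demo_codes = {
    "P1": 0b1010101000001111,
    "P2": 0b1010101000001101,
    "P3": 0b1010101011110000,
    "P4": 0b0101010100001111,
}
demo_index = build_index(demo_codes)
demo_hits = query(demo_index, 0b1010101000001110)
```

Sorting the M codes once costs O(M log M), and each query then needs two binary searches plus a scan over the short candidate range, which is where a saving over comparing every pair would come from under these assumptions.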
format	Online Article Text
id	pubmed-9264577
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9264577 2022-07-09 Identification of all-against-all protein–protein interactions based on deep hash learning Jiang, Yue Wang, Yuxuan Shen, Lin Adjeroh, Donald A. Liu, Zhidong Lin, Jie BMC Bioinformatics Research BioMed Central 2022-07-08 /pmc/articles/PMC9264577/ /pubmed/35804303 http://dx.doi.org/10.1186/s12859-022-04811-x Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Identification of all-against-all protein–protein interactions based on deep hash learning
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9264577/ https://www.ncbi.nlm.nih.gov/pubmed/35804303 http://dx.doi.org/10.1186/s12859-022-04811-x