LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec

Viral infection involves a large number of protein–protein interactions (PPIs) between human and virus. The PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of host transcription machinery. However, few interspecies PPIs have been identified, bec...

Bibliographic Details
Main authors: Tsukiyama, Sho; Hasan, Md Mehedi; Fujii, Satoshi; Kurata, Hiroyuki
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8574953/
https://www.ncbi.nlm.nih.gov/pubmed/34160596
http://dx.doi.org/10.1093/bib/bbab228
collection PubMed
description Viral infection involves a large number of protein–protein interactions (PPIs) between human and viral proteins. These PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of the host transcription machinery. However, few interspecies PPIs have been identified, because experimental methods, including mass spectrometry, are time-consuming and expensive, and molecular dynamics simulation is limited to proteins whose 3D structures have been solved. Sequence-based machine learning methods are expected to overcome these problems. We developed LSTM-PHV, the first LSTM model with word2vec for predicting human-virus PPIs from amino acid sequences alone. LSTM-PHV learned effectively from training data with a highly imbalanced ratio of positive to negative samples, achieving AUCs of 0.976 and 0.973 and accuracies of 0.984 and 0.985 on the training and independent datasets, respectively. In predicting PPIs between human proteins and proteins of unknown or new viruses, LSTM-PHV greatly outperformed existing state-of-the-art PPI predictors. Interestingly, learning only sequence contexts as words is sufficient for PPI prediction. Uniform manifold approximation and projection showed that LSTM-PHV clearly distinguished the positive PPI samples from the negative ones. The LSTM-PHV web server and supporting data are freely available at http://kurata35.bio.kyutech.ac.jp/LSTM-PHV.
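The abstract's idea of treating sequence contexts as "words" can be illustrated with a minimal sketch. It assumes overlapping k-mers as the words and a symmetric context window; the value of k, the window size, the example sequence, and the function names `kmer_words` and `context_pairs` are all hypothetical choices for illustration, not details taken from this record.

```python
from typing import List, Tuple


def kmer_words(sequence: str, k: int = 3) -> List[str]:
    """Split an amino acid sequence into overlapping k-mer 'words'.

    k=3 is an illustrative assumption, not a value from the record.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields no words.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def context_pairs(words: List[str], window: int = 1) -> List[Tuple[str, str]]:
    """List (center, context) word pairs within a symmetric window.

    Pairs like these are the co-occurrence signal a word2vec-style
    model is trained on; the window size is an assumption.
    """
    pairs = []
    for i, center in enumerate(words):
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, words[j]))
    return pairs


if __name__ == "__main__":
    # Made-up toy sequence, not a real protein.
    words = kmer_words("MKTAYIAK", k=3)
    print(words)
    print(context_pairs(words, window=1)[:4])
```

In a full pipeline, the k-mer lists would be passed to an embedding trainer, and the resulting vectors for a human protein and a viral protein would be fed to a sequence model; those stages are omitted here because the record gives no details about them.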
format Online
Article
Text
id pubmed-8574953
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8574953 2021-11-09 Brief Bioinform Problem Solving Protocol Oxford University Press 2021-06-23 Text en © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Problem Solving Protocol