LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec
Viral infection involves a large number of protein–protein interactions (PPIs) between human and virus. The PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of host transcription machinery. However, few interspecies PPIs have been identified, bec...
Main authors: | Tsukiyama, Sho; Hasan, Md Mehedi; Fujii, Satoshi; Kurata, Hiroyuki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8574953/ https://www.ncbi.nlm.nih.gov/pubmed/34160596 http://dx.doi.org/10.1093/bib/bbab228 |
_version_ | 1784595593452060672 |
---|---|
author | Tsukiyama, Sho; Hasan, Md Mehedi; Fujii, Satoshi; Kurata, Hiroyuki |
author_facet | Tsukiyama, Sho; Hasan, Md Mehedi; Fujii, Satoshi; Kurata, Hiroyuki |
author_sort | Tsukiyama, Sho |
collection | PubMed |
description | Viral infection involves a large number of protein–protein interactions (PPIs) between human and virus. The PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of host transcription machinery. However, few interspecies PPIs have been identified, because experimental methods including mass spectrometry are time-consuming and expensive, and molecular dynamics simulation is limited to proteins whose 3D structures have been solved. Sequence-based machine learning methods are expected to overcome these problems. We developed, for the first time, an LSTM model with word2vec, named LSTM-PHV, that predicts PPIs between human and virus from amino acid sequences alone. LSTM-PHV learned effectively from training data with a highly imbalanced ratio of positive to negative samples and achieved AUCs of 0.976 and 0.973 and accuracies of 0.984 and 0.985 on the training and independent datasets, respectively. In predicting PPIs between human and unknown or new viruses, the trained LSTM-PHV greatly outperformed existing state-of-the-art PPI predictors. Interestingly, learning only sequence contexts as words is sufficient for PPI prediction. Uniform manifold approximation and projection showed that LSTM-PHV clearly distinguished the positive PPI samples from the negative ones. The LSTM-PHV web server and supporting data are freely available at http://kurata35.bio.kyutech.ac.jp/LSTM-PHV. |
format | Online Article Text |
id | pubmed-8574953 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-85749532021-11-09 LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec Tsukiyama, Sho Hasan, Md Mehedi Fujii, Satoshi Kurata, Hiroyuki Brief Bioinform Problem Solving Protocol Viral infection involves a large number of protein–protein interactions (PPIs) between human and virus. The PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of host transcription machinery. However, few interspecies PPIs have been identified, because experimental methods including mass spectrometry are time-consuming and expensive, and molecular dynamic simulation is limited only to the proteins whose 3D structures are solved. Sequence-based machine learning methods are expected to overcome these problems. We have first developed the LSTM model with word2vec to predict PPIs between human and virus, named LSTM-PHV, by using amino acid sequences alone. The LSTM-PHV effectively learnt the training data with a highly imbalanced ratio of positive to negative samples and achieved AUCs of 0.976 and 0.973 and accuracies of 0.984 and 0.985 on the training and independent datasets, respectively. In predicting PPIs between human and unknown or new virus, the LSTM-PHV learned greatly outperformed the existing state-of-the-art PPI predictors. Interestingly, learning of only sequence contexts as words is sufficient for PPI prediction. Use of uniform manifold approximation and projection demonstrated that the LSTM-PHV clearly distinguished the positive PPI samples from the negative ones. We presented the LSTM-PHV online web server and support data that are freely available at http://kurata35.bio.kyutech.ac.jp/LSTM-PHV. Oxford University Press 2021-06-23 /pmc/articles/PMC8574953/ /pubmed/34160596 http://dx.doi.org/10.1093/bib/bbab228 Text en © The Author(s) 2021. Published by Oxford University Press. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Problem Solving Protocol Tsukiyama, Sho Hasan, Md Mehedi Fujii, Satoshi Kurata, Hiroyuki LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title | LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title_full | LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title_fullStr | LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title_full_unstemmed | LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title_short | LSTM-PHV: prediction of human-virus protein–protein interactions by LSTM with word2vec |
title_sort | lstm-phv: prediction of human-virus protein–protein interactions by lstm with word2vec |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8574953/ https://www.ncbi.nlm.nih.gov/pubmed/34160596 http://dx.doi.org/10.1093/bib/bbab228 |
work_keys_str_mv | AT tsukiyamasho lstmphvpredictionofhumanvirusproteinproteininteractionsbylstmwithword2vec AT hasanmdmehedi lstmphvpredictionofhumanvirusproteinproteininteractionsbylstmwithword2vec AT fujiisatoshi lstmphvpredictionofhumanvirusproteinproteininteractionsbylstmwithword2vec AT kuratahiroyuki lstmphvpredictionofhumanvirusproteinproteininteractionsbylstmwithword2vec |