Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods

BACKGROUND: Viral-host protein-protein interactions play a vital role in pathogenesis, since they define viral infection of the host and regulation of the host proteins. Identifying key viral-host protein-protein interactions (PPIs) has major implications for therapeutics. METHODS: In this study...

Bibliographic Details
Main Authors: Barman, Ranjan Kumar; Saha, Sudipto; Das, Santasabuj
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223108/
https://www.ncbi.nlm.nih.gov/pubmed/25375323
http://dx.doi.org/10.1371/journal.pone.0112034
author Barman, Ranjan Kumar
Saha, Sudipto
Das, Santasabuj
collection PubMed
description BACKGROUND: Viral-host protein-protein interactions play a vital role in pathogenesis, since they define viral infection of the host and regulation of the host proteins. Identifying key viral-host protein-protein interactions (PPIs) has major implications for therapeutics. METHODS: In this study, a systematic attempt was made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information, using viral-host PPIs from VirusMINT. Three well-known supervised machine learning methods commonly used in PPI prediction (SVM, Naïve Bayes and Random Forest) were employed, and their performance was evaluated by five-fold cross-validation. RESULTS: Out of 44 descriptors, the best features were found to be domain-domain association and the methionine, serine and valine amino acid composition of viral proteins. The SVM-based method achieved a higher sensitivity (67%) than Naïve Bayes (37.49%) and Random Forest (55.66%). However, the specificity of Naïve Bayes was the highest (99.52%), compared with SVM (74%) and Random Forest (89.08%). Overall, SVM and Random Forest achieved accuracies of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on a blind dataset and attained a sensitivity of 64%, a specificity of 83% and an accuracy of 74%. In addition, unknown potential targets in hepatitis B virus-human and hepatitis E virus-human PPIs were predicted with the proposed SVM model and validated by gene ontology enrichment analysis. The proposed model indicates that the hepatitis B virus “C protein” binds to membrane docking protein, while the “X protein” and “P protein” interact with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large-scale interspecies viral-human PPIs. The nature and function of previously unknown host interacting partners of HBV and HEV proteins were identified using the optimised SVM model.
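The sensitivity, specificity and accuracy figures quoted in the description are standard confusion-matrix ratios. As a minimal sketch of how such values are derived (the function name and the counts below are hypothetical illustrations, not taken from the paper or its datasets):

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions.

    tp/fn: interacting pairs predicted correctly / missed.
    tn/fp: non-interacting pairs predicted correctly / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (NOT results from the paper).
sens, spec, acc = classification_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

A classifier can score very high on specificity while missing most true interactions, which is why the record reports all three measures side by side.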
format Online
Article
Text
id pubmed-4223108
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4223108 2014-11-13 Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods Barman, Ranjan Kumar Saha, Sudipto Das, Santasabuj PLoS One Research Article Public Library of Science 2014-11-06 /pmc/articles/PMC4223108/ /pubmed/25375323 http://dx.doi.org/10.1371/journal.pone.0112034 Text en © 2014 Barman et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods
topic Research Article