Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods
BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implication for therapeutics. METHODS: In this study...
Main authors: | Barman, Ranjan Kumar; Saha, Sudipto; Das, Santasabuj |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223108/ https://www.ncbi.nlm.nih.gov/pubmed/25375323 http://dx.doi.org/10.1371/journal.pone.0112034 |
_version_ | 1782343166344560640 |
---|---|
author | Barman, Ranjan Kumar Saha, Sudipto Das, Santasabuj |
author_facet | Barman, Ranjan Kumar Saha, Sudipto Das, Santasabuj |
author_sort | Barman, Ranjan Kumar |
collection | PubMed |
description | BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implication for therapeutics. METHODS: In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information using viral-host PPIs from VirusMINT. The three well-known supervised machine learning methods, such as SVM, Naïve Bayes and Random Forest, which are commonly used in the prediction of PPIs, were employed to evaluate the performance measure based on five-fold cross validation techniques. RESULTS: Out of 44 descriptors, best features were found to be domain-domain association and methionine, serine and valine amino acid composition of viral proteins. In this study, SVM-based method achieved better sensitivity of 67% over Naïve Bayes (37.49%) and Random Forest (55.66%). However the specificity of Naïve Bayes was the highest (99.52%) as compared with SVM (74%) and Random Forest (89.08%). Overall, the SVM and Random Forest achieved accuracy of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that, hepatitis B virus “C protein” binds to membrane docking protein, while “X protein” and “P protein” interacts with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV), interacting partners of host protein were identified using optimised SVM model. |
format | Online Article Text |
id | pubmed-4223108 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4223108 2014-11-13 Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods Barman, Ranjan Kumar Saha, Sudipto Das, Santasabuj PLoS One Research Article BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implication for therapeutics. METHODS: In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information using viral-host PPIs from VirusMINT. The three well-known supervised machine learning methods, such as SVM, Naïve Bayes and Random Forest, which are commonly used in the prediction of PPIs, were employed to evaluate the performance measure based on five-fold cross validation techniques. RESULTS: Out of 44 descriptors, best features were found to be domain-domain association and methionine, serine and valine amino acid composition of viral proteins. In this study, SVM-based method achieved better sensitivity of 67% over Naïve Bayes (37.49%) and Random Forest (55.66%). However the specificity of Naïve Bayes was the highest (99.52%) as compared with SVM (74%) and Random Forest (89.08%). Overall, the SVM and Random Forest achieved accuracy of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that, hepatitis B virus “C protein” binds to membrane docking protein, while “X protein” and “P protein” interacts with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV), interacting partners of host protein were identified using optimised SVM model. Public Library of Science 2014-11-06 /pmc/articles/PMC4223108/ /pubmed/25375323 http://dx.doi.org/10.1371/journal.pone.0112034 Text en © 2014 Barman et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Barman, Ranjan Kumar Saha, Sudipto Das, Santasabuj Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title | Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title_full | Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title_fullStr | Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title_full_unstemmed | Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title_short | Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods |
title_sort | prediction of interactions between viral and host proteins using supervised machine learning methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223108/ https://www.ncbi.nlm.nih.gov/pubmed/25375323 http://dx.doi.org/10.1371/journal.pone.0112034 |
work_keys_str_mv | AT barmanranjankumar predictionofinteractionsbetweenviralandhostproteinsusingsupervisedmachinelearningmethods AT sahasudipto predictionofinteractionsbetweenviralandhostproteinsusingsupervisedmachinelearningmethods AT dassantasabuj predictionofinteractionsbetweenviralandhostproteinsusingsupervisedmachinelearningmethods |
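
The abstract in this record reports amino acid composition as a sequence feature and evaluates classifiers by sensitivity, specificity and accuracy. As a reading aid only, the sketch below shows one common way such quantities might be computed. It is not the authors' code: the function names, the toy sequence and the confusion-matrix counts are all hypothetical, and the standard textbook definitions of the three metrics are assumed.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in a protein sequence.

    Hypothetical helper: non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy (in percent) from confusion counts."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # true positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true negative rate
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Toy sequence and made-up counts, for illustration only.
    composition = amino_acid_composition("MSVM")
    print({aa: pct for aa, pct in composition.items() if pct > 0})
    print(binary_metrics(tp=8, fp=1, tn=9, fn=2))
```

In this sketch the composition is expressed as a percentage of the standard residues found, so the 20 values sum to 100 for any non-empty sequence; whether the article normalises the feature in the same way is not stated in this record.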