HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach

MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) for many pathogenic microbes with their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Sinc...

Bibliographic Details
Main authors: Rahmatbakhsh, Matineh; Moutaoufik, Mohamed Taha; Gagarinova, Alla; Babu, Mohan
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Applications Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154073/
https://www.ncbi.nlm.nih.gov/pubmed/35669347
http://dx.doi.org/10.1093/bioadv/vbac038
author Rahmatbakhsh, Matineh
Moutaoufik, Mohamed Taha
Gagarinova, Alla
Babu, Mohan
collection PubMed
description MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) between many pathogenic microbes and their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed HPiP (Host–Pathogen Interaction Prediction), an R/Bioconductor package that combines a series of amino acid sequence property descriptors with ensemble machine learning classifiers to predict the yet unmapped interactions between pathogen and host proteins. RESULTS: Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP achieves good performance in predicting PPIs between SARS-CoV-2 and human proteins, which we confirmed experimentally in human monocyte THP-1 cells and with several quality control metrics. HPiP also performed strongly in predicting previously reported PPIs when tested on sequences of the pathogenic bacterium Mycobacterium tuberculosis and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and help uncover molecular targets for repurposing existing drugs. AVAILABILITY AND IMPLEMENTATION: HPiP is released as open-source code under the MIT license and is freely available on GitHub (https://github.com/BabuLab-UofR/HPiP) as well as on Bioconductor (http://bioconductor.org/packages/devel/bioc/html/HPiP.html). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9154073
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9154073 2022-06-04 HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach Rahmatbakhsh, Matineh; Moutaoufik, Mohamed Taha; Gagarinova, Alla; Babu, Mohan Bioinform Adv Applications Note MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) between many pathogenic microbes and their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed HPiP (Host–Pathogen Interaction Prediction), an R/Bioconductor package that combines a series of amino acid sequence property descriptors with ensemble machine learning classifiers to predict the yet unmapped interactions between pathogen and host proteins. RESULTS: Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP achieves good performance in predicting PPIs between SARS-CoV-2 and human proteins, which we confirmed experimentally in human monocyte THP-1 cells and with several quality control metrics. HPiP also performed strongly in predicting previously reported PPIs when tested on sequences of the pathogenic bacterium Mycobacterium tuberculosis and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and help uncover molecular targets for repurposing existing drugs. AVAILABILITY AND IMPLEMENTATION: HPiP is released as open-source code under the MIT license and is freely available on GitHub (https://github.com/BabuLab-UofR/HPiP) as well as on Bioconductor (http://bioconductor.org/packages/devel/bioc/html/HPiP.html).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2022-05-23 /pmc/articles/PMC9154073/ /pubmed/35669347 http://dx.doi.org/10.1093/bioadv/vbac038 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach
topic Applications Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154073/
https://www.ncbi.nlm.nih.gov/pubmed/35669347
http://dx.doi.org/10.1093/bioadv/vbac038