HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach
MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) for many pathogenic microbes with their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Sinc...
Main authors: | Rahmatbakhsh, Matineh; Moutaoufik, Mohamed Taha; Gagarinova, Alla; Babu, Mohan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154073/ https://www.ncbi.nlm.nih.gov/pubmed/35669347 http://dx.doi.org/10.1093/bioadv/vbac038 |
_version_ | 1784717963710955520 |
---|---|
author | Rahmatbakhsh, Matineh Moutaoufik, Mohamed Taha Gagarinova, Alla Babu, Mohan |
author_sort | Rahmatbakhsh, Matineh |
collection | PubMed |
description | MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) between many pathogenic microbes and their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed an R/Bioconductor package, HPiP (Host–Pathogen Interaction Prediction), which combines a series of amino acid sequence property descriptors with ensemble machine learning classifiers to predict the yet unmapped interactions between pathogen and host proteins. RESULTS: Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP achieves good performance on PPI predictions between SARS-CoV-2 and human proteins, which we confirmed experimentally in human monocyte THP-1 cells and with several quality control metrics. HPiP also exhibited strong performance in accurately predicting previously reported PPIs when tested against the sequences of the pathogenic bacterium Mycobacterium tuberculosis and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and uncover molecular targets for repurposing existing drugs. AVAILABILITY AND IMPLEMENTATION: HPiP is released as open-source code under the MIT license and is freely available on GitHub (https://github.com/BabuLab-UofR/HPiP) as well as on Bioconductor (http://bioconductor.org/packages/devel/bioc/html/HPiP.html). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-9154073 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-91540732022-06-04 HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach Rahmatbakhsh, Matineh Moutaoufik, Mohamed Taha Gagarinova, Alla Babu, Mohan Bioinform Adv Applications Note MOTIVATION: Despite arduous and time-consuming experimental efforts, protein–protein interactions (PPIs) for many pathogenic microbes with their human host are still unknown, limiting our understanding of the intricate interactions during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed an R/Bioconductor package, HPiP (Host–Pathogen Interaction Prediction) software with a series of amino acid sequence property descriptors and an ensemble machine learning classifiers to predict the yet unmapped interactions between pathogen and host proteins. RESULTS: Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP achieves a good performance with PPI predictions between SARS-CoV-2 and human proteins, which we confirmed experimentally in human monocyte THP-1 cells, and with several quality control metrics. HPiP also exhibited strong performance in accurately predicting the previously reported PPIs when tested against the sequences of pathogenic bacteria, Mycobacterium tuberculosis and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and uncover molecular targets for repurposing existing drugs. AVAILABILITY AND IMPLEMENTATION: HPiP is released as an open-source code under the MIT license that is freely available on GitHub (https://github.com/BabuLab-UofR/HPiP) as well as on Bioconductor (http://bioconductor.org/packages/devel/bioc/html/HPiP.html). 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2022-05-23 /pmc/articles/PMC9154073/ /pubmed/35669347 http://dx.doi.org/10.1093/bioadv/vbac038 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Rahmatbakhsh, Matineh Moutaoufik, Mohamed Taha Gagarinova, Alla Babu, Mohan HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach |
title | HPiP: an R/Bioconductor package for predicting host–pathogen protein–protein interactions from protein sequences using ensemble machine learning approach |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154073/ https://www.ncbi.nlm.nih.gov/pubmed/35669347 http://dx.doi.org/10.1093/bioadv/vbac038 |