BaPreS: a software tool for predicting bacteriocins using an optimal set of features
BACKGROUND: Antibiotic resistance is a major public health concern around the globe. As a result, researchers always look for new compounds to develop new antibiotic drugs for combating antibiotic-resistant bacteria. Bacteriocin becomes a promising antimicrobial agent to fight against antibiotic res...
Main authors: | Akhter, Suraiya; Miller, John H. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10433575/ https://www.ncbi.nlm.nih.gov/pubmed/37592230 http://dx.doi.org/10.1186/s12859-023-05330-z |
_version_ | 1785091679029559296 |
---|---|
author | Akhter, Suraiya Miller, John H. |
author_facet | Akhter, Suraiya Miller, John H. |
author_sort | Akhter, Suraiya |
collection | PubMed |
description | BACKGROUND: Antibiotic resistance is a major public health concern around the globe. As a result, researchers always look for new compounds to develop new antibiotic drugs for combating antibiotic-resistant bacteria. Bacteriocin becomes a promising antimicrobial agent to fight against antibiotic resistance, due to cases of both broad and narrow killing spectra. Sequence matching methods are widely used to identify bacteriocins by comparing them with the known bacteriocin sequences; however, these methods often fail to detect new bacteriocin sequences due to their high diversity. The ability to use a machine learning approach can help find new highly dissimilar bacteriocins for developing highly effective antibiotic drugs. The aim of this work is to develop a machine learning-based software tool called BaPreS (Bacteriocin Prediction Software) using an optimal set of features for detecting bacteriocin protein sequences with high accuracy. We extracted potential features from known bacteriocin and non-bacteriocin sequences by considering the physicochemical and structural properties of the protein sequences. Then we reduced the feature set using statistical justifications and recursive feature elimination technique. Finally, we built support vector machine (SVM) and random forest (RF) models using the selected features and utilized the best machine learning model to implement the software tool. RESULTS: We applied BaPreS to an established dataset and evaluated its prediction performance. Acquired results show that the software tool can achieve a prediction accuracy of 95.54% for testing protein sequences. This tool allows users to add new bacteriocin or non-bacteriocin sequences in the training dataset to further enhance the predictive power of the tool. We compared the prediction performance of the BaPreS with a popular sequence matching-based tool and a deep learning-based method, and our software tool outperformed both. CONCLUSIONS: BaPreS is a bacteriocin prediction tool that can be used to discover new highly dissimilar bacteriocins for developing highly effective antibiotic drugs. This software tool can be used with Windows, Linux and macOS operating systems. The open-source software package and its user manual are available at https://github.com/suraiya14/BaPreS. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05330-z. |
format | Online Article Text |
id | pubmed-10433575 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10433575 2023-08-18 BaPreS: a software tool for predicting bacteriocins using an optimal set of features Akhter, Suraiya Miller, John H. BMC Bioinformatics Software BACKGROUND: Antibiotic resistance is a major public health concern around the globe. As a result, researchers always look for new compounds to develop new antibiotic drugs for combating antibiotic-resistant bacteria. Bacteriocin becomes a promising antimicrobial agent to fight against antibiotic resistance, due to cases of both broad and narrow killing spectra. Sequence matching methods are widely used to identify bacteriocins by comparing them with the known bacteriocin sequences; however, these methods often fail to detect new bacteriocin sequences due to their high diversity. The ability to use a machine learning approach can help find new highly dissimilar bacteriocins for developing highly effective antibiotic drugs. The aim of this work is to develop a machine learning-based software tool called BaPreS (Bacteriocin Prediction Software) using an optimal set of features for detecting bacteriocin protein sequences with high accuracy. We extracted potential features from known bacteriocin and non-bacteriocin sequences by considering the physicochemical and structural properties of the protein sequences. Then we reduced the feature set using statistical justifications and recursive feature elimination technique. Finally, we built support vector machine (SVM) and random forest (RF) models using the selected features and utilized the best machine learning model to implement the software tool. RESULTS: We applied BaPreS to an established dataset and evaluated its prediction performance. Acquired results show that the software tool can achieve a prediction accuracy of 95.54% for testing protein sequences. This tool allows users to add new bacteriocin or non-bacteriocin sequences in the training dataset to further enhance the predictive power of the tool. We compared the prediction performance of the BaPreS with a popular sequence matching-based tool and a deep learning-based method, and our software tool outperformed both. CONCLUSIONS: BaPreS is a bacteriocin prediction tool that can be used to discover new highly dissimilar bacteriocins for developing highly effective antibiotic drugs. This software tool can be used with Windows, Linux and macOS operating systems. The open-source software package and its user manual are available at https://github.com/suraiya14/BaPreS. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05330-z. BioMed Central 2023-08-17 /pmc/articles/PMC10433575/ /pubmed/37592230 http://dx.doi.org/10.1186/s12859-023-05330-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Akhter, Suraiya Miller, John H. BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title | BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title_full | BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title_fullStr | BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title_full_unstemmed | BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title_short | BaPreS: a software tool for predicting bacteriocins using an optimal set of features |
title_sort | bapres: a software tool for predicting bacteriocins using an optimal set of features |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10433575/ https://www.ncbi.nlm.nih.gov/pubmed/37592230 http://dx.doi.org/10.1186/s12859-023-05330-z |
work_keys_str_mv | AT akhtersuraiya bapresasoftwaretoolforpredictingbacteriocinsusinganoptimalsetoffeatures AT millerjohnh bapresasoftwaretoolforpredictingbacteriocinsusinganoptimalsetoffeatures |
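
Note on the description field: the abstract says features were extracted from the physicochemical and structural properties of protein sequences, but this record does not list the features themselves. As a rough illustration only, the sketch below computes amino acid composition, a common sequence-derived feature. Treating it as one of the BaPreS features is an assumption, and the function name `amino_acid_composition` is hypothetical and not taken from the BaPreS code base.

```python
from collections import Counter
from typing import Dict

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper for illustration; not part of BaPreS.
    Non-standard characters (e.g. 'X', '*') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One entry per residue type, so every sequence maps to a
    # fixed-length (20-dimensional) feature vector.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    composition = amino_acid_composition("AAGK")
    print(composition["A"])  # 0.5
```

A fixed-length vector of this kind is the sort of input that feature selection and SVM or random forest classifiers, as named in the abstract, typically operate on.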