Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
Drug discovery relies on predicting drug-target interactions (DTIs), an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet lab experiments are time-consuming and expensive, which is why in recent yea...
Main authors: | Khojasteh, Hakimeh; Pirgazi, Jamshid; Ghanbari Sorkhi, Ali |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10399861/ https://www.ncbi.nlm.nih.gov/pubmed/37535616 http://dx.doi.org/10.1371/journal.pone.0288173 |
_version_ | 1785084339851100160 |
---|---|
author | Khojasteh, Hakimeh Pirgazi, Jamshid Ghanbari Sorkhi, Ali |
author_facet | Khojasteh, Hakimeh Pirgazi, Jamshid Ghanbari Sorkhi, Ali |
author_sort | Khojasteh, Hakimeh |
collection | PubMed |
description | Drug discovery relies on predicting drug-target interactions (DTIs), an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet lab experiments are time-consuming and expensive, which is why computational methods based on machine learning have attracted the attention of many researchers in recent years. A dry lab environment that focuses on computational interaction prediction can help limit the search space for wet lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is imbalanced data caused by the lack of known interactions; in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, the FFS-RF algorithm, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize predictive performance. This feature selection algorithm removes irrelevant features to obtain an optimal feature set. Finally, the balanced dataset with the optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI. |
format | Online Article Text |
id | pubmed-10399861 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10399861 2023-08-04 Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques Khojasteh, Hakimeh Pirgazi, Jamshid Ghanbari Sorkhi, Ali PLoS One Research Article Drug discovery relies on predicting drug-target interactions (DTIs), an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet lab experiments are time-consuming and expensive, which is why computational methods based on machine learning have attracted the attention of many researchers in recent years. A dry lab environment that focuses on computational interaction prediction can help limit the search space for wet lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is imbalanced data caused by the lack of known interactions; in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, the FFS-RF algorithm, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize predictive performance. This feature selection algorithm removes irrelevant features to obtain an optimal feature set. Finally, the balanced dataset with the optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI. 
Public Library of Science 2023-08-03 /pmc/articles/PMC10399861/ /pubmed/37535616 http://dx.doi.org/10.1371/journal.pone.0288173 Text en © 2023 Khojasteh et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Khojasteh, Hakimeh Pirgazi, Jamshid Ghanbari Sorkhi, Ali Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title | Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title_full | Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title_fullStr | Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title_full_unstemmed | Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title_short | Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
title_sort | improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10399861/ https://www.ncbi.nlm.nih.gov/pubmed/37535616 http://dx.doi.org/10.1371/journal.pone.0288173 |
work_keys_str_mv | AT khojastehhakimeh improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques AT pirgazijamshid improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques AT ghanbarisorkhiali improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques |
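
The description field mentions a forward feature selection stage (FFS-RF). As a rough illustration of greedy forward selection in general, the sketch below adds one feature at a time for as long as a score improves. It is not the authors' implementation: the scorer is a toy nearest-centroid classifier standing in for the random forest named in the abstract, the function names `centroid_accuracy` and `forward_feature_selection` are hypothetical, and the data are made-up values that do not come from the paper.

```python
from statistics import mean


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier using only `features`.

    Toy stand-in for the random forest scorer named in the abstract.
    """
    classes = sorted(set(y))
    # Per-class mean of each selected feature.
    centroids = {
        c: [mean(row[f] for row, label in zip(X, y) if label == c) for f in features]
        for c in classes
    }
    correct = 0
    for row, label in zip(X, y):
        point = [row[f] for f in features]
        # Predict the class whose centroid is closest (squared Euclidean distance).
        predicted = min(
            classes,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(y)


def forward_feature_selection(X, y, score=centroid_accuracy, min_gain=1e-9):
    """Greedily add the feature that raises `score` most; stop when nothing helps."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        # Score every candidate feature when added to the current selection.
        candidate_scores = {f: score(X, y, selected + [f]) for f in remaining}
        best_feature = max(remaining, key=lambda f: candidate_scores[f])
        if candidate_scores[best_feature] - best_score < min_gain:
            break  # no remaining feature improves the score
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = candidate_scores[best_feature]
    return selected, best_score


# Made-up toy data: column 0 separates the classes, columns 1 and 2 are weaker.
X = [
    [0.1, 5, 0],
    [0.2, 1, 0],
    [0.3, 3, 1],
    [0.9, 4, 1],
    [1.0, 2, 1],
    [1.1, 3, 0],
]
y = [0, 0, 0, 1, 1, 1]

selected, best_score = forward_feature_selection(X, y)
print(selected, best_score)
```

A real pipeline would score candidates with cross-validation rather than training accuracy, and would plug in a stronger classifier through the `score` parameter.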