Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques

Drug discovery relies on predicting drug-target interactions (DTIs), which is an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet-lab experiments are time-consuming and expensive, which is why in recent yea...

Bibliographic Details
Main authors: Khojasteh, Hakimeh; Pirgazi, Jamshid; Ghanbari Sorkhi, Ali
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10399861/
https://www.ncbi.nlm.nih.gov/pubmed/37535616
http://dx.doi.org/10.1371/journal.pone.0288173
_version_ 1785084339851100160
author Khojasteh, Hakimeh
Pirgazi, Jamshid
Ghanbari Sorkhi, Ali
author_facet Khojasteh, Hakimeh
Pirgazi, Jamshid
Ghanbari Sorkhi, Ali
author_sort Khojasteh, Hakimeh
collection PubMed
description Drug discovery relies on predicting drug-target interactions (DTIs), which is an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet-lab experiments are time-consuming and expensive, which is why, in recent years, computational methods based on machine learning have attracted the attention of many researchers. A dry-lab environment that focuses on computational interaction prediction can help limit the search space for wet-lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is data imbalance caused by the lack of known interactions; in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, FFS-RF, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize predictive performance. This feature selection algorithm removes irrelevant features to obtain an optimal feature set. Finally, the balanced dataset with the optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI.
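The forward feature selection step named in the description can be illustrated with a minimal sketch. This is not the authors' FFS-RF code (that is in the repository linked above); it only shows the general greedy wrapper idea, assuming a pluggable scoring function. The names `forward_select` and `centroid_accuracy`, the toy data, and the nearest-centroid scorer (a lightweight stand-in for a random forest score) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    n_features: int,
    score: Callable[[List[int]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[int], float]:
    """Greedy forward selection: keep adding the feature that most improves the score."""
    selected: List[int] = []
    best = float("-inf")
    remaining = list(range(n_features))
    while remaining:
        # Score every candidate subset formed by adding one remaining feature.
        candidates = [(score(selected + [f]), f) for f in remaining]
        # Highest score wins; ties go to the lowest feature index.
        top_score, top_f = max(candidates, key=lambda t: (t[0], -t[1]))
        if top_score <= best + min_gain:
            break  # no candidate improves the score, so stop
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


def centroid_accuracy(
    X: Sequence[Sequence[float]], y: Sequence[int], cols: List[int]
) -> float:
    """Assumed stand-in scorer: nearest-centroid accuracy on the given columns."""
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, label in zip(X, y):
        pred = min(
            classes,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == label
    return correct / len(y)


# Toy data: column 0 separates the classes, column 1 is noise, column 2 is constant.
X = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 1.0],
    [0.2, 4.0, 1.0],
    [1.0, 1.2, 1.0],
    [1.1, 4.8, 1.0],
    [0.9, 3.0, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

selected, best = forward_select(3, lambda cols: centroid_accuracy(X, y, cols))
print(selected, best)
```

In a realistic setting the scorer would be a cross-validated classifier score on held-out data rather than accuracy on the training rows, and the selected columns would then be passed to the final classifier.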
format Online
Article
Text
id pubmed-10399861
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10399861 2023-08-04 Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques Khojasteh, Hakimeh Pirgazi, Jamshid Ghanbari Sorkhi, Ali PLoS One Research Article Drug discovery relies on predicting drug-target interactions (DTIs), which is an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet-lab experiments are time-consuming and expensive, which is why, in recent years, computational methods based on machine learning have attracted the attention of many researchers. A dry-lab environment that focuses on computational interaction prediction can help limit the search space for wet-lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is data imbalance caused by the lack of known interactions; in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, FFS-RF, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize predictive performance. This feature selection algorithm removes irrelevant features to obtain an optimal feature set. Finally, the balanced dataset with the optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI.
Public Library of Science 2023-08-03 /pmc/articles/PMC10399861/ /pubmed/37535616 http://dx.doi.org/10.1371/journal.pone.0288173 Text en © 2023 Khojasteh et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Khojasteh, Hakimeh
Pirgazi, Jamshid
Ghanbari Sorkhi, Ali
Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title_full Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title_fullStr Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title_full_unstemmed Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title_short Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
title_sort improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10399861/
https://www.ncbi.nlm.nih.gov/pubmed/37535616
http://dx.doi.org/10.1371/journal.pone.0288173
work_keys_str_mv AT khojastehhakimeh improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques
AT pirgazijamshid improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques
AT ghanbarisorkhiali improvingpredictionofdrugtargetinteractionsbasedonfusingmultiplefeatureswithdatabalancingandfeatureselectiontechniques