
An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing

Bibliographic Details
Main Authors: El-Behery, Heba; Attia, Abdel-Fattah; El-Fishawy, Nawal; Torkey, Hanaa
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361677/
https://www.ncbi.nlm.nih.gov/pubmed/35941686
http://dx.doi.org/10.1186/s13036-022-00296-7
author El-Behery, Heba
Attia, Abdel-Fattah
El-Fishawy, Nawal
Torkey, Hanaa
collection PubMed
description BACKGROUND: Drug repositioning has recently received considerable attention because it benefits the pharmaceutical industry in drug development. Artificial intelligence techniques have greatly advanced drug repositioning by discovering therapeutic drug profiles, side effects, and new target proteins. However, as the number of drugs grows, the drugs, their targets, and the enormous number of possible interactions produce imbalanced data that may not be suitable as direct input to a prediction model. METHODS: This paper proposes a novel scheme for predicting drug–target interactions (DTIs) based on drug chemical structures and protein sequences. The drug Morgan fingerprint, drug constitutional descriptors, protein amino acid composition, and protein dipeptide composition were used to characterize the drugs and proteins. To tackle the imbalanced data problem, an approach was developed that extracts negative samples from the drug–target feature sets using a one-class support vector machine classifier. Positive and negative samples were then constructed and fed into different prediction algorithms to identify DTIs. A 10-fold cross-validation procedure was applied to assess the predictive performance of the proposed method, and the effectiveness of the chemical and physical features in evaluating and discovering drug–target interactions was also studied. RESULTS: The experimental model outperformed existing techniques in area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F-score, mean square error, and MCC. With the AdaBoost classifier, accuracy improved by 2.74%, precision by 1.98%, AUC by 1.14%, F-score by 3.53%, and MCC by 4.54% over existing methods.
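The two protein descriptors named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions: amino acid composition as the fraction of each of the 20 standard residues, and dipeptide composition as the fraction of each of the 400 ordered residue pairs. The record does not state the authors' exact definitions or tooling, so the function names and the handling of non-standard residues below are hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order (assumed, not from the record).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Return 20 fractions: the share of each standard residue in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard letters
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions: the share of each ordered pair of adjacent residues."""
    seq = seq.upper()
    valid = set(DIPEPTIDES)
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if p in valid)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def protein_features(seq):
    """Concatenate both descriptors into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    features = protein_features("MKTAYIAKQR")  # toy sequence, for illustration only
    print(len(features))
```

In a pipeline like the one the abstract describes, a vector of this kind would be joined with the drug descriptors before negative sampling and classification; those steps are not shown here.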
format Online
Article
Text
id pubmed-9361677
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9361677 2022-08-10 An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing El-Behery, Heba; Attia, Abdel-Fattah; El-Fishawy, Nawal; Torkey, Hanaa J Biol Eng Research BioMed Central 2022-08-08 /pmc/articles/PMC9361677/ /pubmed/35941686 http://dx.doi.org/10.1186/s13036-022-00296-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing
topic Research