An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing
BACKGROUND: Recently, drug repositioning has received considerable attention for its advantages to the pharmaceutical industry in drug development. Artificial intelligence techniques have greatly enhanced drug reproduction by discovering therapeutic drug profiles, side effects, and new target proteins....
Main authors: | El-Behery, Heba, Attia, Abdel-Fattah, El-Fishawy, Nawal, Torkey, Hanaa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361677/ https://www.ncbi.nlm.nih.gov/pubmed/35941686 http://dx.doi.org/10.1186/s13036-022-00296-7 |
_version_ | 1784764577519501312 |
---|---|
author | El-Behery, Heba Attia, Abdel-Fattah El-Fishawy, Nawal Torkey, Hanaa |
author_facet | El-Behery, Heba Attia, Abdel-Fattah El-Fishawy, Nawal Torkey, Hanaa |
author_sort | El-Behery, Heba |
collection | PubMed |
description | BACKGROUND: Recently, drug repositioning has received considerable attention for its advantages to the pharmaceutical industry in drug development. Artificial intelligence techniques have greatly enhanced drug reproduction by discovering therapeutic drug profiles, side effects, and new target proteins. However, as the number of drugs increases, their targets and the enormous number of interactions produce imbalanced data that may not be suitable as direct input to a prediction model. METHODS: This paper proposes a novel scheme for predicting drug–target interactions (DTIs) based on drug chemical structures and protein sequences. The drug Morgan fingerprint, drug constitutional descriptors, protein amino acid composition, and protein dipeptide composition were employed to extract the characteristics of the drugs and proteins. Then, an approach for extracting negative samples from the feature sets of the drug–target dataset using a one-class support vector machine classifier was developed to tackle the imbalanced data problem. Negative and positive samples were constructed and fed into different prediction algorithms to identify DTIs. A 10-fold cross-validation (CV) procedure was applied to assess the predictive ability of the proposed method, and the effectiveness of the chemical and physical features in evaluating and discovering drug–target interactions was also studied. RESULTS: Our experimental model outperformed existing techniques in terms of area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F-score, mean square error, and Matthews correlation coefficient (MCC). The results obtained by the AdaBoost classifier improved prediction accuracy by 2.74%, precision by 1.98%, AUC by 1.14%, F-score by 3.53%, and MCC by 4.54% over existing methods. |
format | Online Article Text |
id | pubmed-9361677 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9361677 2022-08-10 An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing El-Behery, Heba Attia, Abdel-Fattah El-Fishawy, Nawal Torkey, Hanaa J Biol Eng Research BioMed Central 2022-08-08 /pmc/articles/PMC9361677/ /pubmed/35941686 http://dx.doi.org/10.1186/s13036-022-00296-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research El-Behery, Heba Attia, Abdel-Fattah El-Fishawy, Nawal Torkey, Hanaa An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title | An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title_full | An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title_fullStr | An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title_full_unstemmed | An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title_short | An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
title_sort | ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361677/ https://www.ncbi.nlm.nih.gov/pubmed/35941686 http://dx.doi.org/10.1186/s13036-022-00296-7 |
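
The abstract names protein amino acid composition and protein dipeptide composition among the features used. As a rough illustration only, the sketch below computes both from a protein sequence using their standard textbook definitions (residue frequencies over the 20 standard amino acids, and frequencies of the 400 ordered adjacent pairs). The function names, the residue ordering, and the decision to skip non-standard residue codes are assumptions made here; the record does not describe the authors' implementation.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each ordered pair such as "AC" to a position in the 400-dimensional vector.
DIPEPTIDE_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Non-standard residue codes (for example 'X') are ignored, which is an
    illustrative choice rather than something stated in the abstract.
    """
    residues = [c for c in sequence.upper() if c in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Only pairs in which both residues are standard are counted, and the
    vector is normalised by the number of counted pairs.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDE_INDEX)
    total = 0
    for first, second in zip(seq, seq[1:]):
        position = DIPEPTIDE_INDEX.get(first + second)
        if position is not None:
            counts[position] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [c / total for c in counts]


if __name__ == "__main__":
    aac = amino_acid_composition("ACAC")
    dpc = dipeptide_composition("ACAC")
    print(len(aac), len(dpc))
```

Concatenating the two vectors gives a fixed-length, 420-dimensional description of a protein regardless of its sequence length, which is what allows such features to be fed to standard classifiers.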