An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning
Drug-induced liver injury (DILI) remains an active research field in drug discovery and is one of the most common and important issues in toxicity evaluation. It directly contributes to the high attrition rate of drug candidates. At present, there are a variety of computer algorithms ba...
Main authors: | Yan, Bowei; Ye, Xiaona; Wang, Jing; Han, Junshan; Wu, Lianlian; He, Song; Liu, Kunhong; Bo, Xiaochen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9147181/ https://www.ncbi.nlm.nih.gov/pubmed/35630587 http://dx.doi.org/10.3390/molecules27103112 |
_version_ | 1784716744920662016 |
---|---|
author | Yan, Bowei; Ye, Xiaona; Wang, Jing; Han, Junshan; Wu, Lianlian; He, Song; Liu, Kunhong; Bo, Xiaochen |
author_sort | Yan, Bowei |
collection | PubMed |
description | Drug-induced liver injury (DILI) remains an active research field in drug discovery and is one of the most common and important issues in toxicity evaluation. It directly contributes to the high attrition rate of drug candidates. At present, a variety of computer algorithms based on molecular representations are used to predict DILI. A single molecular representation has been found insufficient for toxicity prediction, so fusions of multiple molecular fingerprints have been used as model input. To address the high-dimensional and imbalanced nature of DILI prediction data, this paper integrates existing datasets and designs a new algorithm framework, Rotation-Ensemble-GA (R-E-GA). The main idea is to rotate the fused high-dimensional molecular representation vector in the feature space and then find a feature subset with better predictive performance. An AdaBoost-type ensemble learning method is then integrated into R-E-GA to improve prediction accuracy. The experimental results show that R-E-GA outperforms other state-of-the-art algorithms, including ensemble learning-based and graph neural network-based methods. Under five-fold cross-validation, R-E-GA obtains an accuracy (ACC) of 0.77, an F1 score of 0.769, and an AUC of 0.842. |
format | Online Article Text |
id | pubmed-9147181 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9147181 2022-05-29 An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning. Yan, Bowei; Ye, Xiaona; Wang, Jing; Han, Junshan; Wu, Lianlian; He, Song; Liu, Kunhong; Bo, Xiaochen. Molecules. Article. MDPI 2022-05-12 /pmc/articles/PMC9147181/ /pubmed/35630587 http://dx.doi.org/10.3390/molecules27103112 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning |
topic | Article |
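The description field above says the framework searches for a feature subset with better predictive performance using a genetic algorithm. The following is a minimal sketch of that general idea only, not the authors' R-E-GA implementation: the rotation step and the AdaBoost-type ensemble are omitted, and the toy data, the nearest-centroid fitness function, the size penalty, and all names (`make_toy_data`, `centroid_accuracy`, `ga_select`) and parameter values are assumptions made for illustration.

```python
import random


def make_toy_data(n_samples=60, n_features=8, seed=1):
    """Hypothetical data: only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label  # informative feature
        row[1] -= 3 * label  # informative feature
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(mask, X, y):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c]))
                for c in cents}
        correct += min(dist, key=dist.get) == t
    return correct / len(y)


def ga_select(X, y, generations=30, pop_size=20, mutation_rate=0.05, seed=0):
    """Evolve boolean feature masks; returns the fittest mask found."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        # Assumed objective: accuracy minus a small penalty per selected feature.
        return centroid_accuracy(mask, X, y) - 0.01 * sum(mask)

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, keeps the elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected features:", [j for j, keep in enumerate(best) if keep])
    print("training accuracy:", centroid_accuracy(best, X, y))
```

In a fuller pipeline the fitness would more plausibly be a cross-validated score of the downstream classifier rather than training accuracy, which is used here only to keep the sketch dependency-free.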