
An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning

In drug discovery, drug-induced liver injury (DILI) remains an active research field and one of the most common and important issues in toxicity evaluation. It directly contributes to the high attrition rate of drug candidates. At present, there are a variety of computer algorithms ba...


Bibliographic Details
Main Authors: Yan, Bowei, Ye, Xiaona, Wang, Jing, Han, Junshan, Wu, Lianlian, He, Song, Liu, Kunhong, Bo, Xiaochen
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9147181/
https://www.ncbi.nlm.nih.gov/pubmed/35630587
http://dx.doi.org/10.3390/molecules27103112
author Yan, Bowei
Ye, Xiaona
Wang, Jing
Han, Junshan
Wu, Lianlian
He, Song
Liu, Kunhong
Bo, Xiaochen
collection PubMed
description In drug discovery, drug-induced liver injury (DILI) remains an active research field and one of the most common and important issues in toxicity evaluation. It directly contributes to the high attrition rate of drug candidates. At present, a variety of computational algorithms based on molecular representations are used to predict DILI. A single molecular representation has been found insufficient for toxicity prediction, so fusions of multiple molecular fingerprints have been used as model input. To address the high dimensionality and class imbalance of DILI prediction data, this paper integrates existing datasets and designs a new algorithm framework, Rotation-Ensemble-GA (R-E-GA). The main idea is to rotate the fused high-dimensional molecular representation vector in the feature space and then find a feature subset with better predictive performance. An AdaBoost-type ensemble learning method is then integrated into R-E-GA to improve prediction accuracy. The experimental results show that R-E-GA outperforms other state-of-the-art algorithms, including ensemble learning-based and graph neural network-based methods. Under five-fold cross-validation, R-E-GA obtains an accuracy (ACC) of 0.77, an F1 score of 0.769, and an AUC of 0.842.
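The three metrics quoted in the description (ACC, F1 score, AUC) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions chosen for illustration, and the record does not say how the per-fold values were aggregated.

```python
# Illustrative definitions of ACC, F1 and ROC AUC for binary labels
# (1 = DILI-positive, 0 = DILI-negative). Hypothetical helper names;
# not taken from the R-E-GA paper.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random
    negative (ties count as half), which equals the area under the
    ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (assumed, not from the paper): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

ACC and F1 depend on the chosen decision threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason such reports usually give all three.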
format Online
Article
Text
id pubmed-9147181
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9147181 2022-05-29 An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning. Molecules. Article. MDPI 2022-05-12 /pmc/articles/PMC9147181/ /pubmed/35630587 http://dx.doi.org/10.3390/molecules27103112 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9147181/
https://www.ncbi.nlm.nih.gov/pubmed/35630587
http://dx.doi.org/10.3390/molecules27103112