Integration of human cell lines gene expression and chemical properties of drugs for Drug Induced Liver Injury prediction
Main authors: | , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7796564/ https://www.ncbi.nlm.nih.gov/pubmed/33422118 http://dx.doi.org/10.1186/s13062-020-00286-z |
Summary: | MOTIVATION: Drug-induced liver injury (DILI) is one of the primary problems in drug development. Early prediction of DILI can significantly reduce the cost of clinical trials. In this work we examined whether the occurrence of DILI can be predicted using gene expression profiles in cancer cell lines and chemical properties of drugs. METHODS: We used gene expression profiles from 13 human cell lines, as well as molecular properties of drugs, to build machine learning (ML) models of DILI. To this end, we used a robust cross-validated protocol based on feature selection and the Random Forest algorithm. In this protocol we first identify the most informative variables and then use them to build predictive models. The models are first built using data from single cell lines and from chemical properties. They are then integrated using the Super Learner method, with several underlying methods for integration. The entire modelling process is performed using nested cross-validation. RESULTS: We obtained weakly predictive ML models when using either molecular descriptors or some individual cell lines (AUC ∈ (0.55−0.61)). Models obtained with the Super Learner approach have significantly improved accuracy (AUC = 0.73), which allows substances to be divided into two categories: low-risk and high-risk. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s13062-020-00286-z). |
---|
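The summary reports model quality as AUC and describes combining several weak per-source models into one stronger score that separates low-risk from high-risk substances. The following is a minimal sketch of that idea only, not the authors' protocol: a plain average stands in for the Super Learner integration, the 0.5 threshold is an arbitrary assumption, the function names are hypothetical, and the labels and scores are made-up toy values rather than data from the article.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def combine_by_mean(per_model_scores):
    """Average the scores of several base models, compound by compound.
    Assumption: a simple mean replaces the Super Learner meta-model."""
    return [mean(column) for column in zip(*per_model_scores)]


def risk_category(score, threshold=0.5):
    """Split compounds into two groups; the threshold is illustrative."""
    return "high-risk" if score >= threshold else "low-risk"


# Toy data, invented for illustration: 1 = DILI, 0 = no DILI.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.875, 0.375, 0.625, 0.5, 0.25, 0.75]   # e.g. one cell line
model_b = [0.625, 0.75, 0.25, 0.375, 0.5, 0.125]   # e.g. chemical descriptors

combined = combine_by_mean([model_a, model_b])

for name, scores in [("model_a", model_a), ("model_b", model_b), ("combined", combined)]:
    print(name, round(auc(labels, scores), 3))

print([risk_category(s) for s in combined])
```

In a real analysis the combined scores would come from out-of-fold predictions inside nested cross-validation, so that the AUC is not estimated on data used to fit either the base models or the combiner.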