Predicting drug-induced liver injury: The importance of data curation
Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention/prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 co...
Main authors: | Kotsampasakou, Eleni; Montanari, Floriane; Ecker, Gerhard F. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6422282/ https://www.ncbi.nlm.nih.gov/pubmed/28652195 http://dx.doi.org/10.1016/j.tox.2017.06.003 |
_version_ | 1783404367242592256 |
---|---|
author | Kotsampasakou, Eleni Montanari, Floriane Ecker, Gerhard F. |
author_facet | Kotsampasakou, Eleni Montanari, Floriane Ecker, Gerhard F. |
author_sort | Kotsampasakou, Eleni |
collection | PubMed |
description | Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention/prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models have been validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and comparable values for two test sets (AUC 59%, 71% and 66%, respectively). In the study we also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, and OATP1B1 and 1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors in order to be provided to the community as a Python script. |
format | Online Article Text |
id | pubmed-6422282 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
record_format | MEDLINE/PubMed |
spelling | pubmed-6422282 2019-03-18 Predicting drug-induced liver injury: The importance of data curation Kotsampasakou, Eleni Montanari, Floriane Ecker, Gerhard F. Toxicology Article Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention/prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models have been validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and comparable values for two test sets (AUC 59%, 71% and 66%, respectively). In the study we also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, and OATP1B1 and 1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors in order to be provided to the community as a Python script. 2017-06-23 2017-08-15 /pmc/articles/PMC6422282/ /pubmed/28652195 http://dx.doi.org/10.1016/j.tox.2017.06.003 Text en http://creativecommons.org/licenses/BY-NC-ND/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/). |
spellingShingle | Article Kotsampasakou, Eleni Montanari, Floriane Ecker, Gerhard F. Predicting drug-induced liver injury: The importance of data curation |
title | Predicting drug-induced liver injury: The importance of data curation |
title_full | Predicting drug-induced liver injury: The importance of data curation |
title_fullStr | Predicting drug-induced liver injury: The importance of data curation |
title_full_unstemmed | Predicting drug-induced liver injury: The importance of data curation |
title_short | Predicting drug-induced liver injury: The importance of data curation |
title_sort | predicting drug-induced liver injury: the importance of data curation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6422282/ https://www.ncbi.nlm.nih.gov/pubmed/28652195 http://dx.doi.org/10.1016/j.tox.2017.06.003 |
work_keys_str_mv | AT kotsampasakoueleni predictingdruginducedliverinjurytheimportanceofdatacuration AT montanarifloriane predictingdruginducedliverinjurytheimportanceofdatacuration AT eckergerhardf predictingdruginducedliverinjurytheimportanceofdatacuration |
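
The abstract in this record reports accuracy and AUC for 10-fold cross-validation averaged over 50 iterations. As a rough illustration of that evaluation protocol only, here is a minimal sketch in plain Python. It is not the authors' script: a toy one-feature threshold model stands in for the Random Forest on 2D descriptors, the data are synthetic, the use of stratified folds is an assumption (the abstract does not say how folds were drawn), and every function name below is hypothetical.

```python
import math
import random
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct calls when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def stratified_folds(labels, k, rng):
    """Split indices into k folds with a roughly constant class ratio (assumption)."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def repeated_cv(X, y, fit, predict, k=10, repeats=50, seed=0):
    """Average accuracy and AUC over `repeats` reshuffled k-fold runs."""
    rng = random.Random(seed)
    accs, aucs = [], []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            scores = [predict(model, X[i]) for i in test_idx]
            truth = [y[i] for i in test_idx]
            accs.append(accuracy(truth, scores))
            aucs.append(auc(truth, scores))
    return mean(accs), mean(aucs)


# Hypothetical stand-in model: threshold halfway between the two class means.
def fit_threshold(X, y):
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return (mean(pos) + mean(neg)) / 2


def predict_threshold(midpoint, x):
    # Logistic score: >= 0.5 exactly when x >= midpoint.
    return 1 / (1 + math.exp(-(x - midpoint)))


# Synthetic data: one noisy feature, shifted by one unit for the positive class.
data_rng = random.Random(42)
y_toy = [0] * 100 + [1] * 100
X_toy = [data_rng.gauss(label, 1.0) for label in y_toy]

mean_acc, mean_auc = repeated_cv(X_toy, y_toy, fit_threshold, predict_threshold)
print(f"mean accuracy: {mean_acc:.2f}, mean AUC: {mean_auc:.2f}")
```

To evaluate a real classifier the same way, `fit_threshold` and `predict_threshold` would be replaced by functions that train the model and return a score per compound. Averaging over repeated reshuffles reduces the dependence of the reported figures on any single random split.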