
Heterogeneity Aware Random Forest for Drug Sensitivity Prediction

Samples collected in pharmacogenomics databases typically belong to various cancer types. When designing a drug sensitivity predictive model from such a database, a natural question is whether a model trained on diverse, inter-tumor heterogeneous samples will perform similarly to a predictive model...


Bibliographic Details
Main authors: Rahman, Raziur; Matlock, Kevin; Ghosh, Souparno; Pal, Ranadip
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5595802/
https://www.ncbi.nlm.nih.gov/pubmed/28900181
http://dx.doi.org/10.1038/s41598-017-11665-4
_version_ 1783263419934179328
author Rahman, Raziur
Matlock, Kevin
Ghosh, Souparno
Pal, Ranadip
author_facet Rahman, Raziur
Matlock, Kevin
Ghosh, Souparno
Pal, Ranadip
author_sort Rahman, Raziur
collection PubMed
description Samples collected in pharmacogenomics databases typically belong to various cancer types. When designing a drug sensitivity predictive model from such a database, a natural question is whether a model trained on diverse, inter-tumor heterogeneous samples will perform similarly to a predictive model that takes the heterogeneity of the samples into consideration in model training and prediction. We explore this hypothesis and observe that ensemble model predictions obtained when cancer type is known outperform predictions obtained when that information is withheld, even when the sample sizes for the former are considerably smaller than the combined sample size. To incorporate the heterogeneity idea into the commonly used ensemble-based Random Forest predictive model, we propose Heterogeneity Aware Random Forests (HARF), which assign weights to the trees based on the category of the sample. We treat heterogeneity as a latent class allocation problem and present a covariate-free class allocation approach based on the distribution of leaf nodes of the model ensemble. Applications to the CCLE and GDSC databases show that HARF outperforms the traditional Random Forest when the average drug responses of cancer types are different.
format Online
Article
Text
id pubmed-5595802
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5595802 2017-09-14 Heterogeneity Aware Random Forest for Drug Sensitivity Prediction Rahman, Raziur Matlock, Kevin Ghosh, Souparno Pal, Ranadip Sci Rep Article Samples collected in pharmacogenomics databases typically belong to various cancer types. When designing a drug sensitivity predictive model from such a database, a natural question is whether a model trained on diverse, inter-tumor heterogeneous samples will perform similarly to a predictive model that takes the heterogeneity of the samples into consideration in model training and prediction. We explore this hypothesis and observe that ensemble model predictions obtained when cancer type is known outperform predictions obtained when that information is withheld, even when the sample sizes for the former are considerably smaller than the combined sample size. To incorporate the heterogeneity idea into the commonly used ensemble-based Random Forest predictive model, we propose Heterogeneity Aware Random Forests (HARF), which assign weights to the trees based on the category of the sample. We treat heterogeneity as a latent class allocation problem and present a covariate-free class allocation approach based on the distribution of leaf nodes of the model ensemble. Applications to the CCLE and GDSC databases show that HARF outperforms the traditional Random Forest when the average drug responses of cancer types are different. Nature Publishing Group UK 2017-09-12 /pmc/articles/PMC5595802/ /pubmed/28900181 http://dx.doi.org/10.1038/s41598-017-11665-4 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Rahman, Raziur
Matlock, Kevin
Ghosh, Souparno
Pal, Ranadip
Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title_full Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title_fullStr Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title_full_unstemmed Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title_short Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
title_sort heterogeneity aware random forest for drug sensitivity prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5595802/
https://www.ncbi.nlm.nih.gov/pubmed/28900181
http://dx.doi.org/10.1038/s41598-017-11665-4
work_keys_str_mv AT rahmanraziur heterogeneityawarerandomforestfordrugsensitivityprediction
AT matlockkevin heterogeneityawarerandomforestfordrugsensitivityprediction
AT ghoshsouparno heterogeneityawarerandomforestfordrugsensitivityprediction
AT palranadip heterogeneityawarerandomforestfordrugsensitivityprediction
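
The description field of this record mentions two ideas: allocating a sample to a latent category from the leaf nodes it reaches, and weighting trees by that category. The Python sketch below illustrates both in a generic way. It is not the authors' implementation; the record does not give the actual allocation rule or weighting formula, so the leaf-count voting rule, the per-category weight table, and all names (allocate_category, weighted_forest_prediction, leaf_category_counts) are hypothetical choices made only for illustration.

```python
from collections import Counter


def allocate_category(leaf_ids, leaf_category_counts):
    """Guess a sample's category from the leaves it reaches (assumed voting rule).

    leaf_ids[t] is the leaf reached in tree t.
    leaf_category_counts[t][leaf] maps category -> number of training
    samples of that category that ended up in that leaf.
    """
    votes = Counter()
    for t, leaf in enumerate(leaf_ids):
        counts = leaf_category_counts[t].get(leaf, {})
        total = sum(counts.values())
        if total == 0:
            continue  # this tree has no training evidence for the leaf
        for category, n in counts.items():
            votes[category] += n / total  # each tree casts one fractional vote
    if not votes:
        raise ValueError("no leaf statistics available for this sample")
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.get)


def weighted_forest_prediction(tree_predictions, tree_weights):
    """Weighted average of per-tree predictions."""
    total = sum(tree_weights)
    if total <= 0:
        raise ValueError("tree weights must sum to a positive value")
    return sum(w * p for w, p in zip(tree_weights, tree_predictions)) / total


# Toy data for three trees (all values invented for illustration).
leaf_category_counts = [
    {0: {"lung": 3, "breast": 1}, 1: {"breast": 4}},
    {0: {"lung": 2}, 1: {"lung": 1, "breast": 3}},
    {0: {"breast": 2}, 1: {"lung": 4}},
]
# Hypothetical per-category tree weights; a real method would estimate these.
weights_by_category = {"lung": [1.0, 1.0, 0.0], "breast": [0.0, 1.0, 1.0]}

leaf_ids = [0, 0, 1]                 # leaves reached by a new sample
tree_predictions = [0.25, 0.5, 1.0]  # per-tree drug response predictions

category = allocate_category(leaf_ids, leaf_category_counts)
prediction = weighted_forest_prediction(
    tree_predictions, weights_by_category[category]
)
print(category, prediction)
```

With a fitted scikit-learn forest, the leaf indices could be obtained from the estimator's apply method, but that wiring is left out here to keep the sketch free of dependencies.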