
Investigation of model stacking for drug sensitivity prediction

BACKGROUND: A significant problem in precision medicine is the prediction of drug sensitivity for individual cancer cell lines. Predictive models such as Random Forests have shown promising performance while predicting from individual genomic features such as gene expressions. However, accessibility...


Bibliographic Details
Main authors: Matlock, Kevin; De Niz, Carlos; Rahman, Raziur; Ghosh, Souparno; Pal, Ranadip
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5872495/
https://www.ncbi.nlm.nih.gov/pubmed/29589559
http://dx.doi.org/10.1186/s12859-018-2060-2
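
The abstract in this record describes stacking models built on two heterogeneous datasets. The following is a minimal sketch of that general idea, not the authors' implementation. To stay self-contained it swaps in one-variable least-squares fits for the random forests named in the abstract, and all data are synthetic. The names `view_a`, `view_b`, `response`, and every helper function are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def out_of_fold(x, y, k=5):
    """Predict each sample with a base model that was fit without it."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = a + b * x[i]
    return preds


def fit_stack_weights(p1, p2, y):
    """Least-squares weights for y ~ w1*p1 + w2*p2 (2x2 normal equations)."""
    s11 = sum(a * a for a in p1)
    s22 = sum(b * b for b in p2)
    s12 = sum(a * b for a, b in zip(p1, p2))
    t1 = sum(a * c for a, c in zip(p1, y))
    t2 = sum(b * c for b, c in zip(p2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


def mse(pred, y):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Synthetic stand-ins (assumptions): one feature per "dataset".
random.seed(0)
n = 200
view_a = [random.gauss(0, 1) for _ in range(n)]  # e.g. a genomic feature
view_b = [random.gauss(0, 1) for _ in range(n)]  # e.g. a drug descriptor
response = [2.0 * a + 1.0 * b + random.gauss(0, 0.1)
            for a, b in zip(view_a, view_b)]

# Level 0: one base model per dataset, scored out of fold.
oof_a = out_of_fold(view_a, response)
oof_b = out_of_fold(view_b, response)

# Level 1: a linear combiner fit on the out-of-fold predictions.
w_a, w_b = fit_stack_weights(oof_a, oof_b, response)
stacked = [w_a * pa + w_b * pb for pa, pb in zip(oof_a, oof_b)]

# The stacked error here is measured on the same samples used to fit the
# combiner weights, so it is an optimistic estimate.
mse_a = mse(oof_a, response)
mse_b = mse(oof_b, response)
mse_stacked = mse(stacked, response)
print(f"view A: {mse_a:.3f}  view B: {mse_b:.3f}  stacked: {mse_stacked:.3f}")
```

Because the combiner weights are chosen by least squares, the stacked error on these samples cannot exceed that of either base model alone. A fair comparison would score the combiner on held-out samples as well.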
_version_ 1783309848505483264
author Matlock, Kevin
De Niz, Carlos
Rahman, Raziur
Ghosh, Souparno
Pal, Ranadip
collection PubMed
description BACKGROUND: A significant problem in precision medicine is the prediction of drug sensitivity for individual cancer cell lines. Predictive models such as Random Forests have shown promising performance while predicting from individual genomic features such as gene expressions. However, accessibility of various other forms of data types including information on multiple tested drugs necessitates the examination of designing predictive models incorporating the various data types. RESULTS: We explore the predictive performance of model stacking and the effect of stacking on the predictive bias and squared error. In addition, we discuss the analytical underpinnings supporting the advantages of stacking in reducing squared error and inherent bias of random forests in prediction of outliers. The framework is tested on a setup including gene expression, drug target, physical properties and drug response information for a set of drugs and cell lines. CONCLUSION: The performance of individual and stacked models is compared. We note that stacking models built on two heterogeneous datasets provides superior performance to stacking different models built on the same dataset. It is also noted that stacking provides a noticeable reduction in the bias of our predictors when the dominant eigenvalue of the principal axis of variation in the residuals is significantly higher than the remaining eigenvalues.
format Online
Article
Text
id pubmed-5872495
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5872495 2018-04-02 Investigation of model stacking for drug sensitivity prediction Matlock, Kevin De Niz, Carlos Rahman, Raziur Ghosh, Souparno Pal, Ranadip BMC Bioinformatics Research
BioMed Central 2018-03-21 /pmc/articles/PMC5872495/ /pubmed/29589559 http://dx.doi.org/10.1186/s12859-018-2060-2 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Investigation of model stacking for drug sensitivity prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5872495/
https://www.ncbi.nlm.nih.gov/pubmed/29589559
http://dx.doi.org/10.1186/s12859-018-2060-2