Interpretable per case weighted ensemble method for cancer associations

BACKGROUND: Molecular measurements from cancer patients such as gene expression and DNA methylation can be influenced by several external factors. This makes it harder to reproduce the exact values of measurements coming from different laboratories. Furthermore, some cancer types are very heterogene...

Bibliographic Details
Main authors:	Jalali, Adrin, Pfeifer, Nico
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4952276/ https://www.ncbi.nlm.nih.gov/pubmed/27435615 http://dx.doi.org/10.1186/s12864-016-2647-9

_version_	1782443785915990016
author	Jalali, Adrin Pfeifer, Nico
author_facet	Jalali, Adrin Pfeifer, Nico
author_sort	Jalali, Adrin
collection	PubMed
description	BACKGROUND: Molecular measurements from cancer patients such as gene expression and DNA methylation can be influenced by several external factors. This makes it harder to reproduce the exact values of measurements coming from different laboratories. Furthermore, some cancer types are very heterogeneous, meaning that there might be different underlying causes for the same type of cancer among different individuals. If a model does not take potential biases in the data into account, this can lead to problems when trying to predict the stage of a certain cancer type. This is especially true when these biases differ between the training and test set. RESULTS: We introduce a method that can estimate this bias on a per-feature level and incorporate calculated feature confidences into a weighted combination of classifiers with disjoint feature sets. In this way, the method provides a prediction that is adjusted for the potential biases on a per-patient basis, providing a personalized prediction for each test patient. The new method achieves state-of-the-art performance on many different cancer data sets with measured DNA methylation or gene expression. Moreover, we show how to visualize the learned classifiers to display interesting associations with the target label. Applied to a leukemia data set, our method finds several ribosomal proteins associated with the risk group, which might be interesting targets for follow-up studies. This discovery supports the hypothesis that the ribosomes are a new frontier in gene regulation. CONCLUSION: We introduce a new method for robust prediction of phenotypes from molecular measurements such as DNA methylation or gene expression. Furthermore, the visualization capabilities enable exploratory analysis on the learnt dependencies and pave the way for a personalized prediction of phenotypes. The software is available under GPL2+ from https://github.com/adrinjalali/Network-Classifier/tree/v1.0. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2647-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4952276
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4952276 2016-07-21 Interpretable per case weighted ensemble method for cancer associations Jalali, Adrin Pfeifer, Nico BMC Genomics Methodology Article BACKGROUND: Molecular measurements from cancer patients such as gene expression and DNA methylation can be influenced by several external factors. This makes it harder to reproduce the exact values of measurements coming from different laboratories. Furthermore, some cancer types are very heterogeneous, meaning that there might be different underlying causes for the same type of cancer among different individuals. If a model does not take potential biases in the data into account, this can lead to problems when trying to predict the stage of a certain cancer type. This is especially true when these biases differ between the training and test set. RESULTS: We introduce a method that can estimate this bias on a per-feature level and incorporate calculated feature confidences into a weighted combination of classifiers with disjoint feature sets. In this way, the method provides a prediction that is adjusted for the potential biases on a per-patient basis, providing a personalized prediction for each test patient. The new method achieves state-of-the-art performance on many different cancer data sets with measured DNA methylation or gene expression. Moreover, we show how to visualize the learned classifiers to display interesting associations with the target label. Applied to a leukemia data set, our method finds several ribosomal proteins associated with the risk group, which might be interesting targets for follow-up studies. This discovery supports the hypothesis that the ribosomes are a new frontier in gene regulation. CONCLUSION: We introduce a new method for robust prediction of phenotypes from molecular measurements such as DNA methylation or gene expression. 
Furthermore, the visualization capabilities enable exploratory analysis on the learnt dependencies and pave the way for a personalized prediction of phenotypes. The software is available under GPL2+ from https://github.com/adrinjalali/Network-Classifier/tree/v1.0. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2647-9) contains supplementary material, which is available to authorized users. BioMed Central 2016-07-19 /pmc/articles/PMC4952276/ /pubmed/27435615 http://dx.doi.org/10.1186/s12864-016-2647-9 Text en © Jalali and Pfeifer. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Article Jalali, Adrin Pfeifer, Nico Interpretable per case weighted ensemble method for cancer associations
title	Interpretable per case weighted ensemble method for cancer associations
title_full	Interpretable per case weighted ensemble method for cancer associations
title_fullStr	Interpretable per case weighted ensemble method for cancer associations
title_full_unstemmed	Interpretable per case weighted ensemble method for cancer associations
title_short	Interpretable per case weighted ensemble method for cancer associations
title_sort	interpretable per case weighted ensemble method for cancer associations
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4952276/ https://www.ncbi.nlm.nih.gov/pubmed/27435615 http://dx.doi.org/10.1186/s12864-016-2647-9
work_keys_str_mv	AT jalaliadrin interpretablepercaseweightedensemblemethodforcancerassociations AT pfeifernico interpretablepercaseweightedensemblemethodforcancerassociations
