
Distillation of the clinical algorithm improves prognosis by multi-task deep learning in high-risk Neuroblastoma

We introduce the CDRP (Concatenated Diagnostic-Relapse Prognostic) architecture for multi-task deep learning that incorporates a clinical algorithm, e.g., a risk stratification schema to improve prognostic profiling. We present the first application to survival prediction in High-Risk (HR) Neuroblas...


Bibliographic Details
Main Authors:	Maggio, Valerio, Chierici, Marco, Jurman, Giuseppe, Furlanello, Cesare
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2018
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6285384/ https://www.ncbi.nlm.nih.gov/pubmed/30532223 http://dx.doi.org/10.1371/journal.pone.0208924

author	Maggio, Valerio Chierici, Marco Jurman, Giuseppe Furlanello, Cesare
collection	PubMed
description	We introduce the CDRP (Concatenated Diagnostic-Relapse Prognostic) architecture for multi-task deep learning, which incorporates a clinical algorithm, e.g., a risk stratification schema, to improve prognostic profiling. We present the first application to survival prediction in High-Risk (HR) Neuroblastoma from transcriptomics data, a task that studies from the MAQC consortium have shown to remain the hardest among multiple diagnostic and prognostic endpoints predictable from the same dataset. To obtain the more accurate risk stratification needed for appropriate treatment strategies, CDRP combines a first component (CDRP-A) synthesizing a diagnostic task and a second component (CDRP-N) dedicated to one or more prognostic tasks. The approach leverages the advent of semi-supervised deep learning structures that can flexibly integrate multimodal data or internally create multiple processing paths. CDRP-A is an autoencoder trained on gene expression on the HR/non-HR risk stratification by the Children’s Oncology Group, obtaining a 64-node representation in the bottleneck layer. CDRP-N is a multi-task classifier for two prognostic endpoints, i.e., Event-Free Survival (EFS) and Overall Survival (OS). CDRP-A provides the HR embedding input to the CDRP-N shared layer, from which two branches depart to model EFS and OS, respectively. To control for selection bias, CDRP is trained and evaluated using a Data Analysis Protocol (DAP) developed within the MAQC initiative. CDRP was applied to Illumina RNA-Seq of 498 Neuroblastoma patients (HR: 176) from the SEQC study (12,464 Entrez genes) and to Affymetrix Human Exon Array expression profiles (17,450 genes) of 247 primary diagnostic Neuroblastoma of the TARGET NBL cohort. On the SEQC HR patients, CDRP achieves a Matthews Correlation Coefficient (MCC) of 0.38 for EFS and 0.19 for OS in external validation, improving over published SEQC models. We show that a CDRP-N embedding is indeed parametrically associated with increasing severity and that the embedding can be used to better stratify patients’ survival.
format	Online Article Text
id	pubmed-6285384
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6285384 2018-12-28 Distillation of the clinical algorithm improves prognosis by multi-task deep learning in high-risk Neuroblastoma Maggio, Valerio Chierici, Marco Jurman, Giuseppe Furlanello, Cesare PLoS One Research Article Public Library of Science 2018-12-07 /pmc/articles/PMC6285384/ /pubmed/30532223 http://dx.doi.org/10.1371/journal.pone.0208924 Text en © 2018 Maggio et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Distillation of the clinical algorithm improves prognosis by multi-task deep learning in high-risk Neuroblastoma
topic	Research Article
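
The abstract in this record describes CDRP as an encoder with a 64-node bottleneck (CDRP-A) feeding a shared layer from which two branches model EFS and OS (CDRP-N). The following is a minimal NumPy sketch of that data flow only, not the authors' implementation: the hidden widths `HIDDEN` and `SHARED`, the ReLU and sigmoid activations, the random weights, and all function names are assumptions made for illustration, and both the decoder half of the autoencoder and all training steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

N_GENES = 12464   # SEQC gene count quoted in the abstract
BOTTLENECK = 64   # bottleneck width quoted in the abstract
HIDDEN = 256      # assumed intermediate encoder width (not from the source)
SHARED = 32       # assumed width of the shared CDRP-N layer (not from the source)


def dense(n_in, n_out):
    """Return small random weights and a zero bias for one dense layer."""
    return rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# CDRP-A (encoder half only): gene expression -> 64-node embedding.
enc1 = dense(N_GENES, HIDDEN)
enc2 = dense(HIDDEN, BOTTLENECK)

# CDRP-N: one shared layer, then one output branch per endpoint.
shared = dense(BOTTLENECK, SHARED)
efs_head = dense(SHARED, 1)
os_head = dense(SHARED, 1)


def encode(x):
    """Map expression profiles to the bottleneck embedding."""
    h = relu(x @ enc1[0] + enc1[1])
    return relu(h @ enc2[0] + enc2[1])


def predict(x):
    """Return per-sample scores in (0, 1) for EFS and OS."""
    s = relu(encode(x) @ shared[0] + shared[1])
    p_efs = sigmoid(s @ efs_head[0] + efs_head[1]).ravel()
    p_os = sigmoid(s @ os_head[0] + os_head[1]).ravel()
    return p_efs, p_os


# Four synthetic expression profiles, used only to check the shapes.
x = rng.normal(size=(4, N_GENES))
p_efs, p_os = predict(x)
print(encode(x).shape, p_efs.shape, p_os.shape)
```

Both branches read the same shared representation, which is the property that makes the classifier multi-task; with untrained random weights the scores themselves carry no meaning.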

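
The results in the abstract are reported as Matthews Correlation Coefficient (MCC). As a reminder of how that metric is computed for a binary endpoint, here is a small sketch using the standard confusion-matrix formula; the function name `mcc`, the convention of returning 0.0 when the denominator is zero, and the toy labels are illustrative assumptions and are unrelated to the SEQC or TARGET data.

```python
import math


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels (made up): 2 true positives, 4 true negatives,
# 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
score = mcc(y_true, y_pred)
print(round(score, 4))  # 7 / 15, i.e. 0.4667
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, so it remains informative when the two classes are imbalanced.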
