Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data

Precision medicine in oncology aims to gather data from heterogeneous sources to obtain a precise estimate of a given patient’s state and prognosis. With the purpose of advancing toward a personalized medicine framework, accurate diagnoses allow the prescription of more effective treatments adapted to the...

Bibliographic Details
Main authors:	López-García, Guillermo; Jerez, José M.; Franco, Leonardo; Veredas, Francisco J.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2020
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7098575/ https://www.ncbi.nlm.nih.gov/pubmed/32214348 http://dx.doi.org/10.1371/journal.pone.0230536

_version_	1783511201982971904
author	López-García, Guillermo Jerez, José M. Franco, Leonardo Veredas, Francisco J.
author_facet	López-García, Guillermo Jerez, José M. Franco, Leonardo Veredas, Francisco J.
author_sort	López-García, Guillermo
collection	PubMed
description	Precision medicine in oncology aims to gather data from heterogeneous sources to obtain a precise estimate of a given patient’s state and prognosis. With the purpose of advancing toward a personalized medicine framework, accurate diagnoses allow the prescription of more effective treatments adapted to the specifics of each individual case. In recent years, next-generation sequencing has propelled cancer research by providing physicians with an overwhelming amount of gene-expression data from RNA-seq high-throughput platforms. In this scenario, data mining and machine learning techniques have contributed widely to gene-expression data analysis by supplying computational models to support decision-making on real-world data. Nevertheless, existing public gene-expression databases are characterized by an unfavorable imbalance between the huge number of genes (on the order of tens of thousands) and the small number of samples (on the order of a few hundred) available. Although diverse feature selection and extraction strategies have traditionally been applied to overcome the resulting over-fitting issues, the efficacy of standard machine learning pipelines is far from satisfactory for the prediction of relevant clinical outcomes such as follow-up end-points or patient survival. Using the public Pan-Cancer dataset, in this study we pre-train convolutional neural network architectures for survival prediction on a subset composed of thousands of gene-expression samples from thirty-one tumor types. The resulting architectures are subsequently fine-tuned to predict lung cancer progression-free interval. The application of convolutional networks to gene-expression data has many limitations, which stem from the unstructured nature of these data. In this work we propose a methodology to rearrange RNA-seq data by transforming RNA-seq samples into gene-expression images, from which convolutional networks can extract high-level features. As an additional objective, we investigate whether leveraging the information extracted from other tumor-type samples contributes to the extraction of high-level features that improve lung cancer progression prediction, compared to other machine learning approaches.
format	Online Article Text
id	pubmed-7098575
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-70985752020-04-03 Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data López-García, Guillermo Jerez, José M. Franco, Leonardo Veredas, Francisco J. PLoS One Research Article Precision medicine in oncology aims to gather data from heterogeneous sources to obtain a precise estimate of a given patient’s state and prognosis. With the purpose of advancing toward a personalized medicine framework, accurate diagnoses allow the prescription of more effective treatments adapted to the specifics of each individual case. In recent years, next-generation sequencing has propelled cancer research by providing physicians with an overwhelming amount of gene-expression data from RNA-seq high-throughput platforms. In this scenario, data mining and machine learning techniques have contributed widely to gene-expression data analysis by supplying computational models to support decision-making on real-world data. Nevertheless, existing public gene-expression databases are characterized by an unfavorable imbalance between the huge number of genes (on the order of tens of thousands) and the small number of samples (on the order of a few hundred) available. Although diverse feature selection and extraction strategies have traditionally been applied to overcome the resulting over-fitting issues, the efficacy of standard machine learning pipelines is far from satisfactory for the prediction of relevant clinical outcomes such as follow-up end-points or patient survival. Using the public Pan-Cancer dataset, in this study we pre-train convolutional neural network architectures for survival prediction on a subset composed of thousands of gene-expression samples from thirty-one tumor types. The resulting architectures are subsequently fine-tuned to predict lung cancer progression-free interval. The application of convolutional networks to gene-expression data has many limitations, which stem from the unstructured nature of these data. In this work we propose a methodology to rearrange RNA-seq data by transforming RNA-seq samples into gene-expression images, from which convolutional networks can extract high-level features. As an additional objective, we investigate whether leveraging the information extracted from other tumor-type samples contributes to the extraction of high-level features that improve lung cancer progression prediction, compared to other machine learning approaches. Public Library of Science 2020-03-26 /pmc/articles/PMC7098575/ /pubmed/32214348 http://dx.doi.org/10.1371/journal.pone.0230536 Text en © 2020 López-García et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article López-García, Guillermo Jerez, José M. Franco, Leonardo Veredas, Francisco J. Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title	Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title_full	Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title_fullStr	Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title_full_unstemmed	Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title_short	Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
title_sort	transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7098575/ https://www.ncbi.nlm.nih.gov/pubmed/32214348 http://dx.doi.org/10.1371/journal.pone.0230536
work_keys_str_mv	AT lopezgarciaguillermo transferlearningwithconvolutionalneuralnetworksforcancersurvivalpredictionusinggeneexpressiondata AT jerezjosem transferlearningwithconvolutionalneuralnetworksforcancersurvivalpredictionusinggeneexpressiondata AT francoleonardo transferlearningwithconvolutionalneuralnetworksforcancersurvivalpredictionusinggeneexpressiondata AT veredasfranciscoj transferlearningwithconvolutionalneuralnetworksforcancersurvivalpredictionusinggeneexpressiondata
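
The abstract above mentions rearranging RNA-seq samples into gene-expression images so that convolutional networks can process them. The record does not say how the genes are arranged, so the following is only a minimal sketch of the general idea: it lays a flat expression vector out row by row on a zero-padded square grid. The function name `expression_to_image`, the `fill` parameter, and the row-major layout are all assumptions made for illustration and should not be read as the authors' method.

```python
import math


def expression_to_image(values, fill=0.0):
    """Arrange a flat gene-expression vector into a square 2D grid.

    Hypothetical helper: the row-major, zero-padded layout is an
    illustrative assumption, not the arrangement used in the article.
    """
    if not values:
        raise ValueError("expected at least one expression value")
    # Smallest side length whose square can hold every value: ceil(sqrt(n)).
    side = math.isqrt(len(values) - 1) + 1
    # Pad the tail so the vector fills the grid exactly.
    padded = list(values) + [fill] * (side * side - len(values))
    # Slice the padded vector into rows of equal length.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    sample = [5.0, 0.0, 2.5, 1.0, 7.5]  # made-up expression values
    for row in expression_to_image(sample):
        print(row)
```

A layout like this gives a convolutional network a fixed-size 2D input, but neighboring pixels carry no biological meaning unless the genes are first ordered by some notion of relatedness.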
