
Deep Learning Enables Fast and Accurate Imputation of Gene Expression


Bibliographic Details
Main Authors: Viñas, Ramon; Azevedo, Tiago; Gamazon, Eric R.; Liò, Pietro
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076954/
https://www.ncbi.nlm.nih.gov/pubmed/33927746
http://dx.doi.org/10.3389/fgene.2021.624128
author Viñas, Ramon
Azevedo, Tiago
Gamazon, Eric R.
Liò, Pietro
collection PubMed
description A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In comparison conducted on the protein-coding genes, PMI attains the highest performance in inductive imputation whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
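The evaluation idea mentioned in the description (hiding part of an expression matrix at varying levels of missingness and scoring how well the hidden values are recovered) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the data are synthetic, the imputer is a plain per-gene mean baseline rather than PMI or GAIN-GTEx, and the helper names `mask_at_random`, `mean_impute`, and `r2_on_hidden` are hypothetical and do not come from the article.

```python
import numpy as np


def mask_at_random(x, missing_rate, rng):
    """Return a boolean mask (True = observed) hiding roughly `missing_rate` of entries."""
    return rng.random(x.shape) >= missing_rate


def mean_impute(x, observed):
    """Baseline imputer (an assumption, not the article's method):
    fill each hidden entry with the mean of that gene's observed values."""
    x_masked = np.where(observed, x, np.nan)
    gene_means = np.nanmean(x_masked, axis=0)  # one mean per column (gene)
    return np.where(observed, x, gene_means)


def r2_on_hidden(x_true, x_imputed, observed):
    """Coefficient of determination computed only on the hidden entries."""
    hidden = ~observed
    residual = np.sum((x_true[hidden] - x_imputed[hidden]) ** 2)
    total = np.sum((x_true[hidden] - x_true[hidden].mean()) ** 2)
    return 1.0 - residual / total


rng = np.random.default_rng(0)

# Synthetic "expression" matrix: 200 samples x 50 genes, each gene with its own offset.
gene_offsets = 3.0 * rng.normal(size=(1, 50))
x = gene_offsets + rng.normal(size=(200, 50))

# Score the baseline at several levels of missingness.
scores = {}
for rate in (0.1, 0.5, 0.9):
    observed = mask_at_random(x, rate, rng)
    x_hat = mean_impute(x, observed)
    scores[rate] = r2_on_hidden(x, x_hat, observed)
    print(f"missing rate {rate:.1f}: R^2 on hidden entries = {scores[rate]:.3f}")
```

Any other imputer can be dropped in place of `mean_impute` as long as it returns a matrix of the same shape and leaves the observed entries unchanged, which keeps the comparison across missingness levels like-for-like.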
format Online
Article
Text
id pubmed-8076954
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8076954 2021-04-28 Deep Learning Enables Fast and Accurate Imputation of Gene Expression Viñas, Ramon Azevedo, Tiago Gamazon, Eric R. Liò, Pietro Front Genet Genetics Frontiers Media S.A. 2021-04-13 /pmc/articles/PMC8076954/ /pubmed/33927746 http://dx.doi.org/10.3389/fgene.2021.624128 Text en Copyright © 2021 Viñas, Azevedo, Gamazon and Liò. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Deep Learning Enables Fast and Accurate Imputation of Gene Expression
topic Genetics