Adversarial generation of gene expression data

MOTIVATION: High-throughput gene expression can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticized because they fail to emulate key properties of gene expres...

Bibliographic Details
Main authors: Viñas, Ramon, Andrés-Terré, Helena, Liò, Pietro, Bryson, Kevin
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8756177/
https://www.ncbi.nlm.nih.gov/pubmed/33471074
http://dx.doi.org/10.1093/bioinformatics/btab035
_version_ 1784632511988498432
author Viñas, Ramon
Andrés-Terré, Helena
Liò, Pietro
Bryson, Kevin
collection PubMed
description MOTIVATION: High-throughput gene expression can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticized because they fail to emulate key properties of gene expression data. In this article, we develop a method based on a conditional generative adversarial network to generate realistic transcriptomics data for Escherichia coli and humans. We assess the performance of our approach across several tissues and cancer-types. RESULTS: We show that our model preserves several gene expression properties significantly better than widely used simulators, such as SynTReN or GeneNetWeaver. The synthetic data preserve tissue- and cancer-specific properties of transcriptomics data. Moreover, it exhibits real gene clusters and ontologies both at local and global scales, suggesting that the model learns to approximate the gene expression manifold in a biologically meaningful way. AVAILABILITY AND IMPLEMENTATION: Code is available at: https://github.com/rvinas/adversarial-gene-expression. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8756177
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8756177 2022-01-13 Adversarial generation of gene expression data Viñas, Ramon Andrés-Terré, Helena Liò, Pietro Bryson, Kevin Bioinformatics Original Papers MOTIVATION: High-throughput gene expression can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticized because they fail to emulate key properties of gene expression data. In this article, we develop a method based on a conditional generative adversarial network to generate realistic transcriptomics data for Escherichia coli and humans. We assess the performance of our approach across several tissues and cancer-types. RESULTS: We show that our model preserves several gene expression properties significantly better than widely used simulators, such as SynTReN or GeneNetWeaver. The synthetic data preserve tissue- and cancer-specific properties of transcriptomics data. Moreover, it exhibits real gene clusters and ontologies both at local and global scales, suggesting that the model learns to approximate the gene expression manifold in a biologically meaningful way. AVAILABILITY AND IMPLEMENTATION: Code is available at: https://github.com/rvinas/adversarial-gene-expression. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-01-20 /pmc/articles/PMC8756177/ /pubmed/33471074 http://dx.doi.org/10.1093/bioinformatics/btab035 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Adversarial generation of gene expression data
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8756177/
https://www.ncbi.nlm.nih.gov/pubmed/33471074
http://dx.doi.org/10.1093/bioinformatics/btab035