CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates
BACKGROUND: In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample lib...
Main authors: | Low, Joel Z. B.; Khang, Tsung Fei; Tammi, Martti T. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751784/ https://www.ncbi.nlm.nih.gov/pubmed/29297307 http://dx.doi.org/10.1186/s12859-017-1974-4 |
_version_ | 1783290017713487872 |
---|---|
author | Low, Joel Z. B. Khang, Tsung Fei Tammi, Martti T. |
author_sort | Low, Joel Z. B. |
collection | PubMed |
description | BACKGROUND: In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. A distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis. RESULTS: We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data. CONCLUSIONS: Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1974-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5751784 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5751784 2018-01-05 CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates Low, Joel Z. B. Khang, Tsung Fei Tammi, Martti T. BMC Bioinformatics Research BioMed Central 2017-12-28 /pmc/articles/PMC5751784/ /pubmed/29297307 http://dx.doi.org/10.1186/s12859-017-1974-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Low, Joel Z. B. Khang, Tsung Fei Tammi, Martti T. CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates |
title | CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751784/ https://www.ncbi.nlm.nih.gov/pubmed/29297307 http://dx.doi.org/10.1186/s12859-017-1974-4 |
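The description above notes that a distribution of true gene counts, each with a different probability, can result in the same observed gene count. The following is a minimal illustrative sketch of that idea only, not the CORNAS model itself. It assumes a toy binomial sampling model (each transcript is observed independently with probability equal to the sequencing coverage) and a flat prior on the true count up to a cutoff. The function name `posterior_true_count` and its parameters are hypothetical and do not come from the CORNAS package.

```python
from math import comb


def posterior_true_count(observed, coverage, max_true):
    """Posterior P(true = n | observed) for n in observed..max_true.

    Toy assumptions (NOT the CORNAS model): observed ~ Binomial(true, coverage)
    and a flat prior on the true count over the range observed..max_true.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    if observed < 0 or max_true < observed:
        raise ValueError("need 0 <= observed <= max_true")

    # Binomial likelihood of the observed count for each candidate true count.
    weights = {
        n: comb(n, observed)
        * coverage ** observed
        * (1.0 - coverage) ** (n - observed)
        for n in range(observed, max_true + 1)
    }
    # Under a flat prior the posterior is the normalized likelihood.
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def posterior_mean(posterior):
    """Expected true count under the posterior."""
    return sum(n * p for n, p in posterior.items())


if __name__ == "__main__":
    post = posterior_true_count(observed=5, coverage=0.5, max_true=200)
    print(round(posterior_mean(post), 3))
```

Under these toy assumptions, lower coverage spreads the posterior over a wider range of plausible true counts, while a coverage of 1 places all the mass on the observed count. The cutoff `max_true` only truncates the tail, so it should be chosen large enough that the remaining probability is negligible.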