Variance component testing for identifying differentially expressed genes in RNA-seq data

RNA sequencing (RNA-Seq) enables the measurement and comparison of gene expression with isoform-level quantification. Differences in the effect of each isoform may make traditional methods, which aggregate isoforms, ineffective. Here, we introduce a variance component-based test that can jointly tes...

Bibliographic Details
Main authors: Yang, Sheng; Shao, Fang; Duan, Weiwei; Zhao, Yang; Chen, Feng
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2017
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5592911/
https://www.ncbi.nlm.nih.gov/pubmed/28929020
http://dx.doi.org/10.7717/peerj.3797
author Yang, Sheng
Shao, Fang
Duan, Weiwei
Zhao, Yang
Chen, Feng
collection PubMed
description RNA sequencing (RNA-Seq) enables the measurement and comparison of gene expression with isoform-level quantification. Differences in the effect of each isoform may make traditional methods, which aggregate isoforms, ineffective. Here, we introduce a variance component-based test that can jointly test multiple isoforms of one gene to identify differentially expressed (DE) genes, especially those with isoforms that have differential effects. We model isoform-level expression data from RNA-Seq using a negative binomial distribution and consider the baseline abundance of isoforms and their effects as two random terms. Our approach tests the global null hypothesis of no difference in any of the isoforms. The null distribution of the derived score statistic is investigated using empirical and theoretical methods. The results of simulations suggest that the performance of the proposed set test is superior to that of traditional algorithms and almost reaches optimal power when the variance of covariates is large. This method is also applied to analyze real data. Our algorithm, as a supplement to traditional algorithms, is superior at selecting DE genes with sparse or opposite effects for isoforms.
format Online
Article
Text
id pubmed-5592911
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5592911 2017-09-19
Variance component testing for identifying differentially expressed genes in RNA-seq data
Yang, Sheng; Shao, Fang; Duan, Weiwei; Zhao, Yang; Chen, Feng
PeerJ
Bioinformatics
PeerJ Inc. 2017-09-08
/pmc/articles/PMC5592911/ /pubmed/28929020 http://dx.doi.org/10.7717/peerj.3797
Text en ©2017 Yang et al. http://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Variance component testing for identifying differentially expressed genes in RNA-seq data
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5592911/
https://www.ncbi.nlm.nih.gov/pubmed/28929020
http://dx.doi.org/10.7717/peerj.3797