
A model for isoform-level differential expression analysis using RNA-seq data without pre-specifying isoform structure

MOTIVATION: Next generation sequencing (NGS) technology has been widely used in biomedical research, particularly in genomics-related studies. One NGS application is high-throughput mRNA sequencing (RNA-seq), which is usually applied to evaluate gene expression level (i.e. copies of is...


Bibliographic Details
Main Authors: Liu, Yang; Wang, Junying; Wu, Song; Yang, Jie
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9109925/
https://www.ncbi.nlm.nih.gov/pubmed/35576204
http://dx.doi.org/10.1371/journal.pone.0266162
author Liu, Yang
Wang, Junying
Wu, Song
Yang, Jie
collection PubMed
description MOTIVATION: Next generation sequencing (NGS) technology has been widely used in biomedical research, particularly in genomics-related studies. One NGS application is high-throughput mRNA sequencing (RNA-seq), which is usually applied to evaluate gene expression level (i.e. copies of isoforms), to identify differentially expressed genes, and to discover potential alternative splicing events. Popular tools for differential expression (DE) analysis using RNA-seq data include edgeR and DESeq. These methods tend to identify DE genes at the gene level, which only allows them to compare the total size of isoforms, that is, the sum of an isoform’s copy number times its length over all isoforms. Naturally, these methods may fail to detect DE genes when the total size of isoforms remains similar but isoform-wise expression levels change dramatically. Other tools can perform isoform-level DE analysis only if isoform structures are known, and so would still fail for many non-model species whose isoform information is missing. To overcome these disadvantages, we developed an isoform-free (without need to pre-specify isoform structures) splicing-graph based negative binomial (SGNB) model for differential expression analysis at the isoform level. Our model detects not only the change in the total size of isoforms but also the change in the isoform-wise expression level and hence is more powerful. RESULTS: We performed extensive simulations to compare our method with edgeR and DESeq. Under various scenarios, our method consistently achieved a higher detection power, while controlling the pre-specified type I error. We also applied our method to a real data set to illustrate its applicability in practice.
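The abstract's point about gene-level comparisons can be illustrated with a small arithmetic sketch. The isoform names, lengths, and copy numbers below are hypothetical toy values chosen for illustration, not data from the paper, and the code is not an implementation of the SGNB model. It only shows how the "total size of isoforms" (the sum of copy number times length) can stay the same while the individual isoform levels change.

```python
def total_isoform_size(copies, lengths):
    """Sum over isoforms of copy number times isoform length."""
    return sum(copies[name] * lengths[name] for name in lengths)


# Hypothetical gene with two isoforms (lengths in bases).
lengths = {"iso_A": 1000, "iso_B": 2000}

# Hypothetical copy numbers under two conditions.
cond1 = {"iso_A": 40, "iso_B": 10}
cond2 = {"iso_A": 10, "iso_B": 25}

size1 = total_isoform_size(cond1, lengths)
size2 = total_isoform_size(cond2, lengths)

# Per-isoform ratio of copy numbers between the two conditions.
fold = {name: cond2[name] / cond1[name] for name in lengths}

print("total size, condition 1:", size1)
print("total size, condition 2:", size2)
print("isoform-wise fold change:", fold)
```

In this toy case both conditions have the same total size, so a comparison based only on that total sees no difference, even though iso_A drops to a quarter of its level and iso_B rises 2.5-fold.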
format Online
Article
Text
id pubmed-9109925
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9109925 2022-05-17 PLoS One Research Article
Public Library of Science 2022-05-16 /pmc/articles/PMC9109925/ /pubmed/35576204 http://dx.doi.org/10.1371/journal.pone.0266162 Text en © 2022 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A model for isoform-level differential expression analysis using RNA-seq data without pre-specifying isoform structure
topic Research Article