Nonparametric expression analysis using inferential replicate counts

A primary challenge in the analysis of RNA-seq data is to identify differentially expressed genes or transcripts while controlling for technical biases. Ideally, a statistical testing procedure should incorporate the inherent uncertainty of the abundance estimates arising from the quantification ste...

Bibliographic Details
Main authors: Zhu, Anqi; Srivastava, Avi; Ibrahim, Joseph G; Patro, Rob; Love, Michael I
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6765120/
https://www.ncbi.nlm.nih.gov/pubmed/31372651
http://dx.doi.org/10.1093/nar/gkz622
author Zhu, Anqi
Srivastava, Avi
Ibrahim, Joseph G
Patro, Rob
Love, Michael I
collection PubMed
description A primary challenge in the analysis of RNA-seq data is to identify differentially expressed genes or transcripts while controlling for technical biases. Ideally, a statistical testing procedure should incorporate the inherent uncertainty of the abundance estimates arising from the quantification step. Most popular methods for RNA-seq differential expression analysis fit a parametric model to the counts for each gene or transcript, and a subset of methods can incorporate uncertainty. Previous work has shown that nonparametric models for RNA-seq differential expression may have better control of the false discovery rate, and adapt well to new data types without requiring reformulation of a parametric model. Existing nonparametric models do not take into account inferential uncertainty, leading to an inflated false discovery rate, in particular at the transcript level. We propose a nonparametric model for differential expression analysis using inferential replicate counts, extending the existing SAMseq method to account for inferential uncertainty. We compare our method, Swish, with popular differential expression analysis methods. Swish has improved control of the false discovery rate, in particular for transcripts with high inferential uncertainty. We apply Swish to a single-cell RNA-seq dataset, assessing differential expression between sub-populations of cells, and compare its performance to the Wilcoxon test.
format Online
Article
Text
id pubmed-6765120
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6765120 2019-10-02
Nucleic Acids Res, Methods Online
Oxford University Press 2019-08-02
/pmc/articles/PMC6765120/ /pubmed/31372651 http://dx.doi.org/10.1093/nar/gkz622
Text en
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Nonparametric expression analysis using inferential replicate counts
topic Methods Online