ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq

Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq o...

Bibliographic Details
Main authors: Fernandes, Andrew D.; Macklaim, Jean M.; Linn, Thomas G.; Reid, Gregor; Gloor, Gregory B.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699591/
https://www.ncbi.nlm.nih.gov/pubmed/23843979
http://dx.doi.org/10.1371/journal.pone.0067019
author Fernandes, Andrew D.
Macklaim, Jean M.
Linn, Thomas G.
Reid, Gregor
Gloor, Gregory B.
author_sort Fernandes, Andrew D.
collection PubMed
description Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq often precludes the large number of experiments needed to partition observed variance into these categories as per standard ANOVA models. We show that the partitioning of within-condition to between-condition variation cannot reasonably be ignored, whether in single-organism RNA-Seq or in Meta-RNA-Seq experiments, and further find that commonly-used RNA-Seq analysis tools, as described in the literature, do not enforce the constraint that the sum of relative expression levels must be one, and thus report expression levels that are systematically distorted. These two factors lead to misleading inferences if not properly accommodated. As it is usually only the biological between-condition and within-condition differences that are of interest, we developed ALDEx, an ANOVA-like differential expression procedure, to identify genes with greater between- to within-condition differences. We show that the presence of differential expression and the magnitude of these comparative differences can be reasonably estimated with even very small sample sizes.
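The sum-to-one constraint mentioned in the description can be illustrated with a minimal sketch. This is not the ALDEx procedure itself; it only shows why relative expression levels are coupled. The function name `relative_expression` and the read counts are hypothetical, chosen for illustration.

```python
def relative_expression(counts):
    """Convert raw read counts to proportions that sum to one."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive total")
    return [c / total for c in counts]


# Hypothetical read counts for three genes under two conditions.
# Only gene 0 changes in absolute terms between the conditions.
condition_a = [100, 100, 100]
condition_b = [400, 100, 100]

rel_a = relative_expression(condition_a)
rel_b = relative_expression(condition_b)

# Genes 1 and 2 have identical counts in both conditions, yet their
# relative levels differ because the proportions are forced to sum to one.
for gene, (a, b) in enumerate(zip(rel_a, rel_b)):
    print(f"gene {gene}: condition A = {a:.3f}, condition B = {b:.3f}")
```

In this toy case the relative levels of genes 1 and 2 halve even though their counts are unchanged, which is the kind of coupling an analysis has to account for when it works with proportions rather than absolute abundances.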
format Online
Article
Text
id pubmed-3699591
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3699591 2013-07-10 ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq Fernandes, Andrew D. Macklaim, Jean M. Linn, Thomas G. Reid, Gregor Gloor, Gregory B. PLoS One Research Article
Public Library of Science 2013-07-02 /pmc/articles/PMC3699591/ /pubmed/23843979 http://dx.doi.org/10.1371/journal.pone.0067019 Text en © 2013 Fernandes et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq
topic Research Article