ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq
Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq o...
Main Authors: | Fernandes, Andrew D.; Macklaim, Jean M.; Linn, Thomas G.; Reid, Gregor; Gloor, Gregory B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699591/ https://www.ncbi.nlm.nih.gov/pubmed/23843979 http://dx.doi.org/10.1371/journal.pone.0067019 |
_version_ | 1782275418864222208 |
---|---|
author | Fernandes, Andrew D.; Macklaim, Jean M.; Linn, Thomas G.; Reid, Gregor; Gloor, Gregory B. |
author_facet | Fernandes, Andrew D.; Macklaim, Jean M.; Linn, Thomas G.; Reid, Gregor; Gloor, Gregory B. |
author_sort | Fernandes, Andrew D. |
collection | PubMed |
description | Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq often precludes the large number of experiments needed to partition observed variance into these categories as per standard ANOVA models. We show that the partitioning of within-condition to between-condition variation cannot reasonably be ignored, whether in single-organism RNA-Seq or in Meta-RNA-Seq experiments, and further find that commonly-used RNA-Seq analysis tools, as described in the literature, do not enforce the constraint that the sum of relative expression levels must be one, and thus report expression levels that are systematically distorted. These two factors lead to misleading inferences if not properly accommodated. As it is usually only the biological between-condition and within-condition differences that are of interest, we developed ALDEx, an ANOVA-like differential expression procedure, to identify genes with greater between- to within-condition differences. We show that the presence of differential expression and the magnitude of these comparative differences can be reasonably estimated with even very small sample sizes. |
format | Online Article Text |
id | pubmed-3699591 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3699591 2013-07-10 ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq Fernandes, Andrew D.; Macklaim, Jean M.; Linn, Thomas G.; Reid, Gregor; Gloor, Gregory B. PLoS One Research Article Public Library of Science 2013-07-02 /pmc/articles/PMC3699591/ /pubmed/23843979 http://dx.doi.org/10.1371/journal.pone.0067019 Text en © 2013 Fernandes et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699591/ https://www.ncbi.nlm.nih.gov/pubmed/23843979 http://dx.doi.org/10.1371/journal.pone.0067019 |