A guide to creating design matrices for gene expression experiments
Differential expression analysis of genomic data types, such as RNA-sequencing experiments, uses linear models to determine the size and direction of the changes in gene expression. For RNA-sequencing, there are several established software packages for this purpose, accompanied by analysis pipeline...
Main authors: | Law, Charity W.; Zeglinski, Kathleen; Dong, Xueyi; Alhamdoosh, Monther; Smyth, Gordon K.; Ritchie, Matthew E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited, 2020 |
Subjects: | Method Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7873980/ https://www.ncbi.nlm.nih.gov/pubmed/33604029 http://dx.doi.org/10.12688/f1000research.27893.1 |
_version_ | 1783649491190022144 |
---|---|
author | Law, Charity W.; Zeglinski, Kathleen; Dong, Xueyi; Alhamdoosh, Monther; Smyth, Gordon K.; Ritchie, Matthew E. |
collection | PubMed |
description | Differential expression analysis of genomic data types, such as RNA-sequencing experiments, uses linear models to determine the size and direction of the changes in gene expression. For RNA-sequencing, there are several established software packages for this purpose, accompanied by analysis pipelines that are well described. However, two crucial steps in the analysis process can be a stumbling block for many: setting up an appropriate model via design matrices and setting up comparisons of interest via contrast matrices. These steps are particularly troublesome because an extensive catalogue for design and contrast matrices does not currently exist. One would usually search for example case studies across different platforms and mix and match the advice from those sources to suit the dataset at hand. This article guides the reader through the basics of how to set up design and contrast matrices. We take a practical approach by providing code and a graphical representation of each case study, starting with simpler examples (e.g. models with a single explanatory variable) and moving on to more complex ones (e.g. interaction models, mixed effects models, higher order time series and cyclical models). Although our work has been written specifically with a limma-style pipeline in mind, most of it is also applicable to other software packages for differential expression analysis, and the ideas covered can be adapted to data analysis of other high-throughput technologies. Where appropriate, we explain the interpretation of and differences between models to aid readers in their own model choices. Unnecessary jargon and theory are omitted where possible so that our work is accessible to a wide audience of readers, from beginners to those with experience in genomics data analysis. |
format | Online Article Text |
id | pubmed-7873980 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-7873980 2021-02-17 A guide to creating design matrices for gene expression experiments. Law, Charity W.; Zeglinski, Kathleen; Dong, Xueyi; Alhamdoosh, Monther; Smyth, Gordon K.; Ritchie, Matthew E. F1000Res, Method Article. F1000 Research Limited 2020-12-10 /pmc/articles/PMC7873980/ /pubmed/33604029 http://dx.doi.org/10.12688/f1000research.27893.1 Text en Copyright: © 2020 Law CW et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A guide to creating design matrices for gene expression experiments |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7873980/ https://www.ncbi.nlm.nih.gov/pubmed/33604029 http://dx.doi.org/10.12688/f1000research.27893.1 |