SARTools: A DESeq2- and EdgeR-Based R Pipeline for Comprehensive Differential Analysis of RNA-Seq Data

BACKGROUND: Several R packages exist for the detection of differentially expressed genes from RNA-Seq data. The analysis process includes three main steps, namely normalization, dispersion estimation and test for differential expression. Quality control steps along this process are recommended but n...

Bibliographic Details
Main authors: Varet, Hugo; Brillet-Guéguen, Loraine; Coppée, Jean-Yves; Dillies, Marie-Agnès
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4900645/
https://www.ncbi.nlm.nih.gov/pubmed/27280887
http://dx.doi.org/10.1371/journal.pone.0157022
author Varet, Hugo
Brillet-Guéguen, Loraine
Coppée, Jean-Yves
Dillies, Marie-Agnès
collection PubMed
description BACKGROUND: Several R packages exist for the detection of differentially expressed genes from RNA-Seq data. The analysis process includes three main steps: normalization, dispersion estimation, and testing for differential expression. Quality control steps along this process are recommended but not mandatory, and failing to check the characteristics of the dataset may lead to spurious results. In addition, normalization methods and statistical models are not exchangeable across the packages without adequate transformations that users are often unaware of. Thus, dedicated analysis pipelines are needed to include systematic quality control steps and prevent errors from misusing the proposed methods.
RESULTS: SARTools is an R pipeline for differential analysis of RNA-Seq count data. It can handle designs involving two or more conditions of a single biological factor, with or without a blocking factor (such as a batch effect or a sample pairing). It is based on DESeq2 and edgeR and is composed of an R package and two R script templates (for DESeq2 and edgeR respectively). By tuning a small number of parameters and executing one of the R scripts, users gain access to the full results of the analysis, including lists of differentially expressed genes and an HTML report that (i) displays diagnostic plots for quality control and model hypotheses checking and (ii) keeps track of the whole analysis process, parameter values and versions of the R packages used.
CONCLUSIONS: SARTools provides systematic quality controls of the dataset as well as diagnostic plots that help to tune the model parameters. It gives access to the main parameters of DESeq2 and edgeR and prevents untrained users from misusing some functionalities of both packages. By keeping track of all the parameters of the analysis process, it fits the requirements of reproducible research.
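The normalization step named in the abstract can be illustrated with a minimal sketch of the median-of-ratios method, which is the default normalization in DESeq2. This is a plain-Python illustration under stated assumptions, not code from SARTools or DESeq2: the function names `size_factors` and `normalize` and the toy count matrix are hypothetical, and genes with a zero count in any sample are simply skipped when estimating the factors.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by median of ratios.

    counts: list of genes, each a list of raw counts (one per sample).
    Hypothetical helper for illustration; not the DESeq2 API.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # A gene with a zero count has a geometric mean of zero, so skip it.
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has positive counts in every sample")
    # The size factor is the median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, factors):
    """Divide each raw count by the size factor of its sample."""
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]


# Toy data (assumption): sample 2 was sequenced about twice as deeply.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts, toy_factors)
```

After normalization, the first three toy genes have equal values in both samples, which is the behaviour one would expect when the only difference between samples is sequencing depth.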
format Online
Article
Text
id pubmed-4900645
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4900645 2016-06-24
SARTools: A DESeq2- and EdgeR-Based R Pipeline for Comprehensive Differential Analysis of RNA-Seq Data
Varet, Hugo; Brillet-Guéguen, Loraine; Coppée, Jean-Yves; Dillies, Marie-Agnès
PLoS One, Research Article
Public Library of Science, 2016-06-09
/pmc/articles/PMC4900645/ /pubmed/27280887 http://dx.doi.org/10.1371/journal.pone.0157022
Text, en
© 2016 Varet et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title SARTools: A DESeq2- and EdgeR-Based R Pipeline for Comprehensive Differential Analysis of RNA-Seq Data
topic Research Article