
An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study

RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we in...


Bibliographic Details
Main Authors: Wang, Zichen; Ma'ayan, Avi
Format: Online Article Text
Language: English
Published: F1000Research 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4972086/
https://www.ncbi.nlm.nih.gov/pubmed/27583132
http://dx.doi.org/10.12688/f1000research.9110.1
author Wang, Zichen
Ma'ayan, Avi
collection PubMed
description RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.
format Online
Article
Text
id pubmed-4972086
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-4972086 2016-08-30 F1000Res Method Article
F1000Research 2016-07-05 /pmc/articles/PMC4972086/ /pubmed/27583132 http://dx.doi.org/10.12688/f1000research.9110.1 Text en
Copyright: © 2016 Wang Z and Ma'ayan A. This is an open access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study
topic Method Article