An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study
RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we in...
Main authors: | Wang, Zichen; Ma'ayan, Avi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000Research 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4972086/ https://www.ncbi.nlm.nih.gov/pubmed/27583132 http://dx.doi.org/10.12688/f1000research.9110.1 |
_version_ | 1782446209416298496 |
---|---|
author | Wang, Zichen Ma'ayan, Avi |
author_facet | Wang, Zichen Ma'ayan, Avi |
author_sort | Wang, Zichen |
collection | PubMed |
description | RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/. |
format | Online Article Text |
id | pubmed-4972086 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | F1000Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-4972086 2016-08-30 An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study Wang, Zichen Ma'ayan, Avi F1000Res Method Article RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/. F1000Research 2016-07-05 /pmc/articles/PMC4972086/ /pubmed/27583132 http://dx.doi.org/10.12688/f1000research.9110.1 Text en Copyright: © 2016 Wang Z and Ma'ayan A http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Method Article Wang, Zichen Ma'ayan, Avi An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study |
title | An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study |
title_sort | open rna-seq data analysis pipeline tutorial with an example of reprocessing data from a recent zika virus study |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4972086/ https://www.ncbi.nlm.nih.gov/pubmed/27583132 http://dx.doi.org/10.12688/f1000research.9110.1 |
work_keys_str_mv | AT wangzichen anopenrnaseqdataanalysispipelinetutorialwithanexampleofreprocessingdatafromarecentzikavirusstudy AT maayanavi anopenrnaseqdataanalysispipelinetutorialwithanexampleofreprocessingdatafromarecentzikavirusstudy AT wangzichen openrnaseqdataanalysispipelinetutorialwithanexampleofreprocessingdatafromarecentzikavirusstudy AT maayanavi openrnaseqdataanalysispipelinetutorialwithanexampleofreprocessingdatafromarecentzikavirusstudy |
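
The description above mentions enrichment analyses that test whether a list of differentially expressed genes overlaps a gene set more than chance would predict. As a rough illustration only, the sketch below computes a one-sided hypergeometric (Fisher-type) p-value for such an overlap using the Python standard library. This record does not state which statistic the pipeline uses, so the test itself, the function name `overlap_p_value`, and the toy gene labels are all assumptions made for this example rather than part of the authors' notebook.

```python
from math import comb


def overlap_p_value(hits, gene_set, background_size):
    """Return (overlap size, one-sided hypergeometric p-value).

    Assumption for this sketch: `hits` and `gene_set` are both drawn from
    the same background of `background_size` genes.
    """
    hits = set(hits)
    gene_set = set(gene_set)
    k = len(hits & gene_set)   # observed overlap
    n = len(hits)              # size of the differentially expressed list
    big_k = len(gene_set)      # size of the annotated gene set
    big_n = background_size    # size of the background

    # Probability of seeing an overlap of k or more by chance.
    total = comb(big_n, n)
    tail = sum(
        comb(big_k, i) * comb(big_n - big_k, n - i)
        for i in range(k, min(n, big_k) + 1)
    )
    return k, tail / total


# Hypothetical toy labels, not real genes or real results.
toy_hits = {"g1", "g2", "g3", "g4"}
toy_set = {"g1", "g2", "g3", "g8", "g9"}
overlap, p_value = overlap_p_value(toy_hits, toy_set, background_size=20)
print(overlap, p_value)
```

In practice such a test would be repeated for every gene set in a library and the resulting p-values corrected for multiple testing; that step is omitted here to keep the sketch short.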