QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization

BACKGROUND: RNA sequencing (RNA-seq), a next-generation sequencing technique for transcriptome profiling, is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. M...

Bibliographic Details
Main Authors: Zhao, Shanrong, Xi, Li, Quan, Jie, Xi, Hualin, Zhang, Ying, von Schack, David, Vincent, Michael, Zhang, Baohong
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4706714/
https://www.ncbi.nlm.nih.gov/pubmed/26747388
http://dx.doi.org/10.1186/s12864-015-2356-9
author Zhao, Shanrong
Xi, Li
Quan, Jie
Xi, Hualin
Zhang, Ying
von Schack, David
Vincent, Michael
Zhang, Baohong
author_sort Zhao, Shanrong
collection PubMed
description BACKGROUND: RNA sequencing (RNA-seq), a next-generation sequencing technique for transcriptome profiling, is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. Multiple algorithms pertinent to basic analyses have been developed, and there is an increasing need to automate the use of these tools so as to obtain results in an efficient and user-friendly manner. Increased automation and improved visualization of the results will help make the results and findings of the analyses readily available to experimental scientists. RESULTS: By combining the best open source tools developed for RNA-seq data analyses and the most advanced web 2.0 technologies, we have implemented QuickRNASeq, a pipeline for large-scale RNA-seq data analyses and visualization. The QuickRNASeq workflow consists of three main steps. In Step #1, each individual sample is processed, including mapping RNA-seq reads to a reference genome, counting the numbers of mapped reads, quality control of the aligned reads, and SNP (single nucleotide polymorphism) calling. Step #1 is computationally intensive, and can be processed in parallel. In Step #2, the results from individual samples are merged, and an integrated and interactive project report is generated. All analysis results in the report are accessible via a single HTML entry webpage. Step #3 is the data interpretation and presentation step. The rich visualization features implemented here allow end users to interactively explore the results of RNA-seq data analyses, and to gain more insights into RNA-seq datasets. In addition, we used a real-world dataset to demonstrate the simplicity and efficiency of QuickRNASeq in RNA-seq data analyses and interactive visualizations. The seamless integration of automated capabilities with interactive visualizations in QuickRNASeq is not available in other published RNA-seq pipelines. CONCLUSION: The high degree of automation and interactivity in QuickRNASeq leads to a substantial reduction in the time and effort required prior to further downstream analyses and interpretation of the analysis findings. QuickRNASeq advances primary RNA-seq data analyses to the next level of automation, and is mature for public release and adoption.
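The first two workflow steps described in the abstract (independent per-sample processing that can run in parallel, followed by a merge into one project-level table) can be illustrated with a minimal Python sketch. This is not QuickRNASeq code: the function names `process_sample`, `merge_counts`, and `run_pipeline` are hypothetical, and the input is a toy list of gene labels per sample standing in for real alignment and read-counting output.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def process_sample(sample_name, read_gene_hits):
    """Stand-in for Step #1: count reads per gene for a single sample.

    Hypothetical helper; a real pipeline would map reads and run QC here.
    """
    return sample_name, Counter(read_gene_hits)


def merge_counts(per_sample):
    """Stand-in for Step #2: merge per-sample counts into a gene-by-sample table."""
    samples = sorted(per_sample)
    genes = sorted({gene for counts in per_sample.values() for gene in counts})
    # A Counter returns 0 for genes that a sample never observed.
    table = {gene: [per_sample[s][gene] for s in samples] for gene in genes}
    return samples, table


def run_pipeline(samples, max_workers=4):
    """Process every sample independently in a worker pool, then merge."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(pool.map(lambda item: process_sample(*item), samples.items()))
    return merge_counts(results)


# Toy input (assumed for illustration): each read is already labelled with a gene.
toy_samples = {
    "S1": ["geneA", "geneA", "geneB"],
    "S2": ["geneB", "geneC"],
}
sample_order, count_table = run_pipeline(toy_samples)
print(sample_order)
print(count_table)
```

Because no sample depends on another in the first step, the per-sample work can be distributed freely; only the merge needs all results at once.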
format Online
Article
Text
id pubmed-4706714
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4706714 2016-01-10 QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization Zhao, Shanrong Xi, Li Quan, Jie Xi, Hualin Zhang, Ying von Schack, David Vincent, Michael Zhang, Baohong BMC Genomics Software BioMed Central 2016-01-08 /pmc/articles/PMC4706714/ /pubmed/26747388 http://dx.doi.org/10.1186/s12864-015-2356-9 Text en © Zhao et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization
topic Software