
A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor

Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq...


Bibliographic Details
Main authors: Lun, Aaron T.L., McCarthy, Davis J., Marioni, John C.
Format: Online Article Text
Language: English
Published: F1000Research 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5112579/
https://www.ncbi.nlm.nih.gov/pubmed/27909575
http://dx.doi.org/10.12688/f1000research.9501.2
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.
format Online
Article
Text
id pubmed-5112579
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-5112579 2016-11-30 F1000Res Software Tool Article F1000Research 2016-10-31 /pmc/articles/PMC5112579/ /pubmed/27909575 http://dx.doi.org/10.12688/f1000research.9501.2 Text en Copyright: © 2016 Lun ATL et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Software Tool Article