Statistical methods for analysis of single-cell RNA-sequencing data

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at the single-cell level. Analyzing scRNA-seq data in the presence of biological confounding factors, including dropout events, is a challenging task. Thus, this article pre...

Bibliographic Details
Main authors: Das, Samarendra; Rai, Shesh N.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720898/
https://www.ncbi.nlm.nih.gov/pubmed/35004214
http://dx.doi.org/10.1016/j.mex.2021.101580
_version_ 1784625224087502848
author Das, Samarendra
Rai, Shesh N.
author_facet Das, Samarendra
Rai, Shesh N.
author_sort Das, Samarendra
collection PubMed
description Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at the single-cell level. Analyzing scRNA-seq data in the presence of biological confounding factors, including dropout events, is a challenging task. This article therefore presents a novel statistical approach for several analyses of scRNA-seq Unique Molecular Identifier (UMI) count data. These analyses include modeling and fitting of observed UMI data, cell type detection, estimation of cell capture rates, estimation of gene-specific model parameters, and estimation of the sample mean and sample variance of the genes, among others. The approach can also perform differential expression and other downstream analyses that account for the molecular capture process in scRNA-seq data modeling. External spike-in data can also be used in the approach for better results. The distinguishing feature of the method is that, in modeling the observed UMI counts of genes, it accounts for the biological process that leads to severe dropout events.
• The differential expression analysis of observed scRNA-seq UMI count data is performed after adjustment for cell capture rates.
• The statistical approach performs downstream differential zero-inflation analysis, classification of influential genes, and selection of top marker genes.
• Cell auxiliaries, including cell clusters and other cell variables (e.g., cell cycle, cell phase), are used to remove unwanted variation so that statistical tests can be performed reliably.
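The idea of estimating cell capture rates from spike-ins and adjusting counts before downstream analysis can be illustrated with a small simulation. This is a minimal sketch, not the authors' model or software: it assumes that each molecule in a cell is captured independently with a cell-specific probability (binomial thinning), and that the capture rate can be estimated as the ratio of observed to input spike-in molecules. All names and numbers (`spike_input`, `gene_true`, `true_rates`) are hypothetical.

```python
import random
from statistics import mean, variance


def binomial(n, p, rng):
    """Draw one Binomial(n, p) variate by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_cell(true_counts, capture_rate, rng):
    """Thin each true molecule count with the cell's capture rate."""
    return [binomial(n, capture_rate, rng) for n in true_counts]


def estimate_capture_rate(observed_spikes, input_spikes):
    """Assumed estimator: observed spike-in UMIs / known input molecules."""
    total_input = sum(input_spikes)
    if total_input == 0:
        raise ValueError("spike-in input must be positive")
    return sum(observed_spikes) / total_input


rng = random.Random(0)            # fixed seed for reproducibility
spike_input = [200, 500, 1000]    # hypothetical spike-in molecules added per cell
gene_true = [50, 0, 300, 5]       # hypothetical true transcript counts (4 genes)
true_rates = [0.05, 0.10, 0.20]   # hypothetical capture rates (3 cells)

estimated_rates = []
adjusted = []                     # capture-adjusted counts, one row per cell
for rate in true_rates:
    obs_spikes = simulate_cell(spike_input, rate, rng)
    obs_genes = simulate_cell(gene_true, rate, rng)
    est = estimate_capture_rate(obs_spikes, spike_input)
    estimated_rates.append(est)
    # Scale observed UMI counts by the estimated capture rate.
    adjusted.append([count / est for count in obs_genes])

# Per-gene sample mean and sample variance across cells, after adjustment.
gene_means = [mean(col) for col in zip(*adjusted)]
gene_vars = [variance(col) for col in zip(*adjusted)]

print("estimated capture rates:", [round(r, 3) for r in estimated_rates])
print("adjusted gene means:", [round(m, 1) for m in gene_means])
```

With low capture rates, lowly expressed genes such as the hypothetical fourth gene are often observed as zero even though they are expressed, which is the dropout effect the abstract refers to. Dividing by the estimated rate rescales counts but cannot recover those zeros, which is why a model of the capture process is needed.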
format Online
Article
Text
id pubmed-8720898
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8720898 2022-01-07 Statistical methods for analysis of single-cell RNA-sequencing data Das, Samarendra Rai, Shesh N. MethodsX Method Article Elsevier 2021-11-17 /pmc/articles/PMC8720898/ /pubmed/35004214 http://dx.doi.org/10.1016/j.mex.2021.101580 Text en Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Statistical methods for analysis of single-cell RNA-sequencing data
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720898/
https://www.ncbi.nlm.nih.gov/pubmed/35004214
http://dx.doi.org/10.1016/j.mex.2021.101580