Statistical methods for analysis of single-cell RNA-sequencing data
Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at the single-cell level. Analyzing scRNA-seq data in the presence of biological confounding factors, including dropout events, is a challenging task. Thus, this article pre...
Main Authors: | Das, Samarendra, Rai, Shesh N. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | Method Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720898/ https://www.ncbi.nlm.nih.gov/pubmed/35004214 http://dx.doi.org/10.1016/j.mex.2021.101580 |
_version_ | 1784625224087502848 |
---|---|
author | Das, Samarendra Rai, Shesh N. |
author_facet | Das, Samarendra Rai, Shesh N. |
author_sort | Das, Samarendra |
collection | PubMed |
description | Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at the single-cell level. Analyzing scRNA-seq data in the presence of biological confounding factors, including dropout events, is a challenging task. This article therefore presents a novel statistical approach for various analyses of scRNA-seq Unique Molecular Identifier (UMI) count data. These analyses include modeling and fitting of observed UMI data, cell type detection, estimation of cell capture rates, estimation of gene-specific model parameters, estimation of the sample mean and sample variance of the genes, etc. In addition, the approach can perform differential expression and other downstream analyses that account for the molecular capture process in scRNA-seq data modeling. External spike-in data can also be used in the approach for better results. The distinguishing feature of the method is that it considers the biological process that leads to severe dropout events when modeling the observed UMI counts of genes. • The differential expression analysis of observed scRNA-seq UMI count data is performed after adjustment for cell capture rates. • The statistical approach performs downstream differential zero-inflation analysis, classification of influential genes, and selection of top marker genes. • Cell auxiliaries, including cell clusters and other cell variables (e.g., cell cycle, cell phase), are used to remove unwanted variation so that statistical tests can be performed reliably. |
format | Online Article Text |
id | pubmed-8720898 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-87208982022-01-07 Statistical methods for analysis of single-cell RNA-sequencing data Das, Samarendra Rai, Shesh N. MethodsX Method Article Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at single-cell level. Analyzing the scRNA-seq data in presence of biological confounding factors including dropout events is a challenging task. Thus, this article presents a novel statistical approach for various analyses of the scRNA-seq Unique Molecular Identifier (UMI) counts data. The various analyses include modeling and fitting of observed UMI data, cell type detection, estimation of cell capture rates, estimation of gene specific model parameters, estimation of the sample mean and sample variance of the genes, etc. Besides, the developed approach is able to perform differential expression, and other downstream analyses that consider the molecular capture process in scRNA-seq data modeling. Here, the external spike-ins data can also be used in the approach for better results. The unique feature of the method is that it considers the biological process that leads to severe dropout events in modeling the observed UMI counts of genes. • The differential expression analysis of observed scRNA-seq UMI counts data is performed after adjustment for cell capture rates. • The statistical approach performs downstream differential zero inflation analysis, classification of influential genes, and selection of top marker genes. • Cell auxiliaries including cell clusters and other cell variables (e.g., cell cycle, cell phase) are used to remove unwanted variation to perform statistical tests reliably. Elsevier 2021-11-17 /pmc/articles/PMC8720898/ /pubmed/35004214 http://dx.doi.org/10.1016/j.mex.2021.101580 Text en Published by Elsevier B.V. https://creativecommons.org/licenses/by/4.0/This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Method Article Das, Samarendra Rai, Shesh N. Statistical methods for analysis of single-cell RNA-sequencing data |
title | Statistical methods for analysis of single-cell RNA-sequencing data |
title_full | Statistical methods for analysis of single-cell RNA-sequencing data |
title_fullStr | Statistical methods for analysis of single-cell RNA-sequencing data |
title_full_unstemmed | Statistical methods for analysis of single-cell RNA-sequencing data |
title_short | Statistical methods for analysis of single-cell RNA-sequencing data |
title_sort | statistical methods for analysis of single-cell rna-sequencing data |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720898/ https://www.ncbi.nlm.nih.gov/pubmed/35004214 http://dx.doi.org/10.1016/j.mex.2021.101580 |
work_keys_str_mv | AT dassamarendra statisticalmethodsforanalysisofsinglecellrnasequencingdata AT raisheshn statisticalmethodsforanalysisofsinglecellrnasequencingdata |