Fast identification of differential distributions in single-cell RNA-sequencing data with waddR

MOTIVATION: Single-cell gene expression distributions measured by single-cell RNA-sequencing (scRNA-seq) often display complex differences between samples. These differences are biologically meaningful but cannot be identified using standard methods for differential expression. RESULTS: Here, we der...

Bibliographic Details
Main authors: Schefzik, Roman; Flesch, Julian; Goncalves, Angela
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504634/
https://www.ncbi.nlm.nih.gov/pubmed/33792651
http://dx.doi.org/10.1093/bioinformatics/btab226
_version_ 1784581359595945984
author Schefzik, Roman
Flesch, Julian
Goncalves, Angela
author_facet Schefzik, Roman
Flesch, Julian
Goncalves, Angela
author_sort Schefzik, Roman
collection PubMed
description MOTIVATION: Single-cell gene expression distributions measured by single-cell RNA-sequencing (scRNA-seq) often display complex differences between samples. These differences are biologically meaningful but cannot be identified using standard methods for differential expression. RESULTS: Here, we derive and implement a flexible and fast differential distribution testing procedure based on the 2-Wasserstein distance. Our method is able to detect any type of difference in distribution between conditions. To interpret distributional differences, we decompose the 2-Wasserstein distance into terms that capture the relative contribution of changes in mean, variance and shape to the overall difference. Finally, we derive mathematical generalizations that allow our method to be used in a broad range of disciplines other than scRNA-seq or bioinformatics. AVAILABILITY AND IMPLEMENTATION: Our methods are implemented in the R/Bioconductor package waddR, which is freely available at https://github.com/goncalves-lab/waddR, along with documentation and examples. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8504634
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8504634 2021-10-13 Fast identification of differential distributions in single-cell RNA-sequencing data with waddR Schefzik, Roman Flesch, Julian Goncalves, Angela Bioinformatics Original Papers MOTIVATION: Single-cell gene expression distributions measured by single-cell RNA-sequencing (scRNA-seq) often display complex differences between samples. These differences are biologically meaningful but cannot be identified using standard methods for differential expression. RESULTS: Here, we derive and implement a flexible and fast differential distribution testing procedure based on the 2-Wasserstein distance. Our method is able to detect any type of difference in distribution between conditions. To interpret distributional differences, we decompose the 2-Wasserstein distance into terms that capture the relative contribution of changes in mean, variance and shape to the overall difference. Finally, we derive mathematical generalizations that allow our method to be used in a broad range of disciplines other than scRNA-seq or bioinformatics. AVAILABILITY AND IMPLEMENTATION: Our methods are implemented in the R/Bioconductor package waddR, which is freely available at https://github.com/goncalves-lab/waddR, along with documentation and examples. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-04-01 /pmc/articles/PMC8504634/ /pubmed/33792651 http://dx.doi.org/10.1093/bioinformatics/btab226 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Papers
Schefzik, Roman
Flesch, Julian
Goncalves, Angela
Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title_full Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title_fullStr Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title_full_unstemmed Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title_short Fast identification of differential distributions in single-cell RNA-sequencing data with waddR
title_sort fast identification of differential distributions in single-cell rna-sequencing data with waddr
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504634/
https://www.ncbi.nlm.nih.gov/pubmed/33792651
http://dx.doi.org/10.1093/bioinformatics/btab226
work_keys_str_mv AT schefzikroman fastidentificationofdifferentialdistributionsinsinglecellrnasequencingdatawithwaddr
AT fleschjulian fastidentificationofdifferentialdistributionsinsinglecellrnasequencingdatawithwaddr
AT goncalvesangela fastidentificationofdifferentialdistributionsinsinglecellrnasequencingdatawithwaddr
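
Illustrative note: the abstract in this record describes decomposing the 2-Wasserstein distance into terms for changes in mean, variance and shape. The following is a minimal Python sketch of one way such a decomposition can be computed for two one-dimensional samples. It is not the waddR code (waddR is an R package), and the function name, the quantile grid and the example data are assumptions made for illustration only. The sketch relies on the algebraic identity that the mean squared difference of two quantile functions equals a squared difference of means, plus a squared difference of standard deviations, plus a remainder that depends on how closely the two quantile functions are correlated.

```python
import numpy as np


def wasserstein2_squared_decomposition(x, y, n_quantiles=1000):
    """Approximate the squared 2-Wasserstein distance between two samples
    and split it into location, size and shape terms.

    Hypothetical helper for illustration; not part of waddR.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Evaluate both empirical quantile functions on a shared grid of
    # probability levels (midpoints, so 0 and 1 are avoided).
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qx = np.quantile(x, levels)
    qy = np.quantile(y, levels)

    # Squared 2-Wasserstein distance in 1D: mean squared quantile difference.
    total = np.mean((qx - qy) ** 2)

    # Location term: squared difference of the means.
    location = (qx.mean() - qy.mean()) ** 2

    # Size term: squared difference of the standard deviations.
    sx, sy = qx.std(), qy.std()
    size = (sx - sy) ** 2

    # Shape term: what remains once location and size are accounted for.
    # Equal to 2 * sx * sy * (1 - correlation of the quantile functions).
    cov = np.mean((qx - qx.mean()) * (qy - qy.mean()))
    shape = 2.0 * (sx * sy - cov)

    return {"total": total, "location": location, "size": size, "shape": shape}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated expression values for one gene in two conditions (made up).
    condition_a = rng.normal(loc=0.0, scale=1.0, size=500)
    condition_b = rng.normal(loc=2.0, scale=1.0, size=500)
    parts = wasserstein2_squared_decomposition(condition_a, condition_b)
    for name, value in parts.items():
        print(f"{name}: {value:.4f}")
```

In this sketch the three terms always sum to the total, so dividing each by the total gives the relative contribution of each kind of change. Turning the distance into a significance test, as the abstract describes, would need an additional step that is not shown here.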