Efficient differential expression analysis of large-scale single cell transcriptomics data using dreamlet

Advances in single-cell and -nucleus transcriptomics have enabled generation of increasingly large-scale datasets from hundreds of subjects and millions of cells. These studies promise to give unprecedented insight into the cell type specific biology of human disease. Yet performing differential exp...

Bibliographic Details
Main authors: Hoffman, Gabriel E., Lee, Donghoon, Bendl, Jaroslav, Fnu, Prashant, Hong, Aram, Casey, Clara, Alvia, Marcela, Shao, Zhiping, Argyriou, Stathis, Therrien, Karen, Venkatesh, Sanan, Voloudakis, Georgios, Haroutunian, Vahram, Fullard, John F., Roussos, Panos
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055252/
https://www.ncbi.nlm.nih.gov/pubmed/36993704
http://dx.doi.org/10.1101/2023.03.17.533005
author Hoffman, Gabriel E.
Lee, Donghoon
Bendl, Jaroslav
Fnu, Prashant
Hong, Aram
Casey, Clara
Alvia, Marcela
Shao, Zhiping
Argyriou, Stathis
Therrien, Karen
Venkatesh, Sanan
Voloudakis, Georgios
Haroutunian, Vahram
Fullard, John F.
Roussos, Panos
collection PubMed
description Advances in single-cell and -nucleus transcriptomics have enabled generation of increasingly large-scale datasets from hundreds of subjects and millions of cells. These studies promise to give unprecedented insight into the cell type specific biology of human disease. Yet performing differential expression analyses across subjects remains difficult due to challenges in statistical modeling of these complex studies and scaling analyses to large datasets. Our open-source R package dreamlet (DiseaseNeurogenomics.github.io/dreamlet) uses a pseudobulk approach based on precision-weighted linear mixed models to identify genes differentially expressed with traits across subjects for each cell cluster. Designed for data from large cohorts, dreamlet is substantially faster and uses less memory than existing workflows, while supporting complex statistical models and controlling the false positive rate. We demonstrate computational and statistical performance on published datasets, and a novel dataset of 1.4M single nuclei from postmortem brains of 150 Alzheimer’s disease cases and 149 controls.
format Online
Article
Text
id pubmed-10055252
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10055252 2023-03-30 Efficient differential expression analysis of large-scale single cell transcriptomics data using dreamlet Hoffman, Gabriel E. Lee, Donghoon Bendl, Jaroslav Fnu, Prashant Hong, Aram Casey, Clara Alvia, Marcela Shao, Zhiping Argyriou, Stathis Therrien, Karen Venkatesh, Sanan Voloudakis, Georgios Haroutunian, Vahram Fullard, John F. Roussos, Panos bioRxiv Article Advances in single-cell and -nucleus transcriptomics have enabled generation of increasingly large-scale datasets from hundreds of subjects and millions of cells. These studies promise to give unprecedented insight into the cell type specific biology of human disease. Yet performing differential expression analyses across subjects remains difficult due to challenges in statistical modeling of these complex studies and scaling analyses to large datasets. Our open-source R package dreamlet (DiseaseNeurogenomics.github.io/dreamlet) uses a pseudobulk approach based on precision-weighted linear mixed models to identify genes differentially expressed with traits across subjects for each cell cluster. Designed for data from large cohorts, dreamlet is substantially faster and uses less memory than existing workflows, while supporting complex statistical models and controlling the false positive rate. We demonstrate computational and statistical performance on published datasets, and a novel dataset of 1.4M single nuclei from postmortem brains of 150 Alzheimer’s disease cases and 149 controls.
Cold Spring Harbor Laboratory 2023-03-20 /pmc/articles/PMC10055252/ /pubmed/36993704 http://dx.doi.org/10.1101/2023.03.17.533005 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Efficient differential expression analysis of large-scale single cell transcriptomics data using dreamlet
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055252/
https://www.ncbi.nlm.nih.gov/pubmed/36993704
http://dx.doi.org/10.1101/2023.03.17.533005