scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data

Single-cell RNA-sequencing (scRNA-seq) has been widely used for disease studies, where sample batches are collected from donors under different conditions including demographical groups, disease stages, and drug treatments. It is worth noting that the differences among sample batches in such a study...

Bibliographic Details
Main Authors: Zhang, Ziqi, Zhao, Xinye, Qiu, Peng, Zhang, Xiuwei
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187255/
https://www.ncbi.nlm.nih.gov/pubmed/37205545
http://dx.doi.org/10.1101/2023.05.01.538975
author Zhang, Ziqi
Zhao, Xinye
Qiu, Peng
Zhang, Xiuwei
collection PubMed
description Single-cell RNA-sequencing (scRNA-seq) has been widely used for disease studies, where sample batches are collected from donors under different conditions including demographical groups, disease stages, and drug treatments. It is worth noting that the differences among sample batches in such a study are a mixture of technical confounders caused by batch effect and the biological variations caused by condition effect. However, current batch effect removal methods often eliminate both technical batch effects and meaningful condition effects, while perturbation prediction methods solely focus on condition effects, resulting in inaccurate gene expression predictions due to unaccounted batch effects. Here we introduce scDisInFact, a deep learning framework that models both batch effect and condition effect in scRNA-seq data. scDisInFact learns latent factors that disentangle condition effects from batch effects, enabling it to simultaneously perform three tasks: batch effect removal, condition-associated key gene detection, and perturbation prediction. We evaluated scDisInFact on both simulated and real datasets, and compared its performance to baseline methods for each task. Our results demonstrate that scDisInFact outperforms existing methods that focus on individual tasks, providing a more comprehensive and accurate approach for integrating and predicting multi-batch multi-condition single-cell RNA-sequencing data.
format Online
Article
Text
id pubmed-10187255
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10187255 2023-05-17 scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data Zhang, Ziqi Zhao, Xinye Qiu, Peng Zhang, Xiuwei bioRxiv Article
Cold Spring Harbor Laboratory 2023-05-02 /pmc/articles/PMC10187255/ /pubmed/37205545 http://dx.doi.org/10.1101/2023.05.01.538975 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187255/
https://www.ncbi.nlm.nih.gov/pubmed/37205545
http://dx.doi.org/10.1101/2023.05.01.538975