scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data
Single-cell RNA-sequencing (scRNA-seq) has been widely used for disease studies, where sample batches are collected from donors under different conditions including demographical groups, disease stages, and drug treatments. It is worth noting that the differences among sample batches in such a study...
Main authors: | Zhang, Ziqi; Zhao, Xinye; Qiu, Peng; Zhang, Xiuwei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187255/ https://www.ncbi.nlm.nih.gov/pubmed/37205545 http://dx.doi.org/10.1101/2023.05.01.538975 |
_version_ | 1785042708852637696 |
---|---|
author | Zhang, Ziqi Zhao, Xinye Qiu, Peng Zhang, Xiuwei |
author_facet | Zhang, Ziqi Zhao, Xinye Qiu, Peng Zhang, Xiuwei |
author_sort | Zhang, Ziqi |
collection | PubMed |
description | Single-cell RNA-sequencing (scRNA-seq) has been widely used for disease studies, where sample batches are collected from donors under different conditions including demographical groups, disease stages, and drug treatments. It is worth noting that the differences among sample batches in such a study are a mixture of technical confounders caused by batch effect and the biological variations caused by condition effect. However, current batch effect removal methods often eliminate both technical batch effects and meaningful condition effects, while perturbation prediction methods solely focus on condition effects, resulting in inaccurate gene expression predictions due to unaccounted batch effects. Here we introduce scDisInFact, a deep learning framework that models both batch effect and condition effect in scRNA-seq data. scDisInFact learns latent factors that disentangle condition effects from batch effects, enabling it to simultaneously perform three tasks: batch effect removal, condition-associated key gene detection, and perturbation prediction. We evaluated scDisInFact on both simulated and real datasets, and compared its performance to baseline methods for each task. Our results demonstrate that scDisInFact outperforms existing methods that focus on individual tasks, providing a more comprehensive and accurate approach for integrating and predicting multi-batch multi-condition single-cell RNA-sequencing data. |
format | Online Article Text |
id | pubmed-10187255 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-101872552023-05-17 scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data Zhang, Ziqi Zhao, Xinye Qiu, Peng Zhang, Xiuwei bioRxiv Article Single-cell RNA-sequencing (scRNA-seq) has been widely used for disease studies, where sample batches are collected from donors under different conditions including demographical groups, disease stages, and drug treatments. It is worth noting that the differences among sample batches in such a study are a mixture of technical confounders caused by batch effect and the biological variations caused by condition effect. However, current batch effect removal methods often eliminate both technical batch effects and meaningful condition effects, while perturbation prediction methods solely focus on condition effects, resulting in inaccurate gene expression predictions due to unaccounted batch effects. Here we introduce scDisInFact, a deep learning framework that models both batch effect and condition effect in scRNA-seq data. scDisInFact learns latent factors that disentangle condition effects from batch effects, enabling it to simultaneously perform three tasks: batch effect removal, condition-associated key gene detection, and perturbation prediction. We evaluated scDisInFact on both simulated and real datasets, and compared its performance to baseline methods for each task. Our results demonstrate that scDisInFact outperforms existing methods that focus on individual tasks, providing a more comprehensive and accurate approach for integrating and predicting multi-batch multi-condition single-cell RNA-sequencing data. 
Cold Spring Harbor Laboratory 2023-05-02 /pmc/articles/PMC10187255/ /pubmed/37205545 http://dx.doi.org/10.1101/2023.05.01.538975 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Zhang, Ziqi Zhao, Xinye Qiu, Peng Zhang, Xiuwei scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title_full | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title_fullStr | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title_full_unstemmed | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title_short | scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data |
title_sort | scdisinfact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell rna-sequencing data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187255/ https://www.ncbi.nlm.nih.gov/pubmed/37205545 http://dx.doi.org/10.1101/2023.05.01.538975 |
work_keys_str_mv | AT zhangziqi scdisinfactdisentangledlearningforintegrationandpredictionofmultibatchmulticonditionsinglecellrnasequencingdata AT zhaoxinye scdisinfactdisentangledlearningforintegrationandpredictionofmultibatchmulticonditionsinglecellrnasequencingdata AT qiupeng scdisinfactdisentangledlearningforintegrationandpredictionofmultibatchmulticonditionsinglecellrnasequencingdata AT zhangxiuwei scdisinfactdisentangledlearningforintegrationandpredictionofmultibatchmulticonditionsinglecellrnasequencingdata |