
Single-cell multi-omics integration for unpaired data by a siamese network with graph-based contrastive loss

BACKGROUND: Single-cell omics technology is rapidly developing to measure the epigenome, genome, and transcriptome across a range of cell types. However, it is still challenging to integrate omics data from different modalities. Here, we propose a variation of the Siamese neural network framework ca...


Bibliographic Details
Main authors: Liu, Chaozhong, Wang, Linhua, Liu, Zhandong
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9812356/
https://www.ncbi.nlm.nih.gov/pubmed/36600199
http://dx.doi.org/10.1186/s12859-022-05126-7
author Liu, Chaozhong
Wang, Linhua
Liu, Zhandong
collection PubMed
description BACKGROUND: Single-cell omics technology is rapidly developing to measure the epigenome, genome, and transcriptome across a range of cell types. However, it is still challenging to integrate omics data from different modalities. Here, we propose a variation of the Siamese neural network framework called MinNet, which is trained to integrate multi-omics data on the single-cell resolution by using graph-based contrastive loss. RESULTS: By training the model and testing it on several benchmark datasets, we showed its accuracy and generalizability in integrating scRNA-seq with scATAC-seq, and scRNA-seq with epitope data. Further evaluation demonstrated our model's unique ability to remove the batch effect, a common problem in actual practice. To show how the integration impacts downstream analysis, we established model-based smoothing and cis-regulatory element-inferring method and validated it with external pcHi-C evidence. Finally, we applied the framework to a COVID-19 dataset to bolster the original work with integration-based analysis, showing its necessity in single-cell multi-omics research. CONCLUSIONS: MinNet is a novel deep-learning framework for single-cell multi-omics sequencing data integration. It ranked top among other methods in benchmarking and is especially suitable for integrating datasets with batch and biological variances. With the single-cell resolution integration results, analysis of the interplay between genome and transcriptome can be done to help researchers understand their data and question. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05126-7.
format Online
Article
Text
id pubmed-9812356
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9812356 2023-01-04 Single-cell multi-omics integration for unpaired data by a siamese network with graph-based contrastive loss Liu, Chaozhong Wang, Linhua Liu, Zhandong BMC Bioinformatics Research Article
BioMed Central 2023-01-04 /pmc/articles/PMC9812356/ /pubmed/36600199 http://dx.doi.org/10.1186/s12859-022-05126-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Single-cell multi-omics integration for unpaired data by a siamese network with graph-based contrastive loss
topic Research Article