Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data
With the rapid advance of single-cell RNA sequencing (scRNA-seq) technology, understanding biological processes at a more refined single-cell level is becoming possible. Gene co-expression estimation is an essential step in this direction. It can annotate functionalities of unknown genes or construc...
Main authors: | Zhang, Jiaqi; Singh, Ritambhara |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900775/ https://www.ncbi.nlm.nih.gov/pubmed/36747724 http://dx.doi.org/10.1101/2023.01.24.525447 |
_version_ | 1784882916227022848 |
---|---|
author | Zhang, Jiaqi Singh, Ritambhara |
author_facet | Zhang, Jiaqi Singh, Ritambhara |
author_sort | Zhang, Jiaqi |
collection | PubMed |
description | With the rapid advance of single-cell RNA sequencing (scRNA-seq) technology, understanding biological processes at a more refined single-cell level is becoming possible. Gene co-expression estimation is an essential step in this direction. It can annotate functionalities of unknown genes or construct the basis of gene regulatory network inference. This study thoroughly tests the existing gene co-expression estimation methods on simulation datasets with known ground truth co-expression networks. We generate these novel datasets using two simulation processes that use the parameters learned from the experimental data. We demonstrate that these simulations better capture the underlying properties of the real-world single-cell datasets than previously tested simulations for the task. Our performance results on tens of simulated and eight experimental datasets show that all methods produce estimations with a high false discovery rate potentially caused by high-sparsity levels in the data. Finally, we find that commonly used pre-processing approaches, such as normalization and imputation, do not improve the co-expression estimation. Overall, our benchmark setup contributes to the co-expression estimator development, and our study provides valuable insights for the community of single-cell data analyses. |
format | Online Article Text |
id | pubmed-9900775 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-99007752023-02-07 Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data Zhang, Jiaqi Singh, Ritambhara bioRxiv Article With the rapid advance of single-cell RNA sequencing (scRNA-seq) technology, understanding biological processes at a more refined single-cell level is becoming possible. Gene co-expression estimation is an essential step in this direction. It can annotate functionalities of unknown genes or construct the basis of gene regulatory network inference. This study thoroughly tests the existing gene co-expression estimation methods on simulation datasets with known ground truth co-expression networks. We generate these novel datasets using two simulation processes that use the parameters learned from the experimental data. We demonstrate that these simulations better capture the underlying properties of the real-world single-cell datasets than previously tested simulations for the task. Our performance results on tens of simulated and eight experimental datasets show that all methods produce estimations with a high false discovery rate potentially caused by high-sparsity levels in the data. Finally, we find that commonly used pre-processing approaches, such as normalization and imputation, do not improve the co-expression estimation. Overall, our benchmark setup contributes to the co-expression estimator development, and our study provides valuable insights for the community of single-cell data analyses. 
Cold Spring Harbor Laboratory 2023-01-25 /pmc/articles/PMC9900775/ /pubmed/36747724 http://dx.doi.org/10.1101/2023.01.24.525447 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Zhang, Jiaqi Singh, Ritambhara Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title | Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title_full | Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title_fullStr | Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title_full_unstemmed | Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title_short | Investigating the Complexity of Gene Co-expression Estimation for Single-cell Data |
title_sort | investigating the complexity of gene co-expression estimation for single-cell data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900775/ https://www.ncbi.nlm.nih.gov/pubmed/36747724 http://dx.doi.org/10.1101/2023.01.24.525447 |
work_keys_str_mv | AT zhangjiaqi investigatingthecomplexityofgenecoexpressionestimationforsinglecelldata AT singhritambhara investigatingthecomplexityofgenecoexpressionestimationforsinglecelldata |