Regulatory network-based imputation of dropouts in single-cell RNA sequencing data
Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusi...
Main authors: | Leote, Ana Carolina; Wu, Xiaohui; Beyer, Andreas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8890719/ https://www.ncbi.nlm.nih.gov/pubmed/35176023 http://dx.doi.org/10.1371/journal.pcbi.1009849 |
_version_ | 1784661704326512640 |
---|---|
author | Leote, Ana Carolina Wu, Xiaohui Beyer, Andreas |
author_facet | Leote, Ana Carolina Wu, Xiaohui Beyer, Andreas |
author_sort | Leote, Ana Carolina |
collection | PubMed |
description | Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Further, it is unknown if all genes equally benefit from imputation or which imputation method works best for a given gene. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Further, the cell-to-cell variation of 11.3% to 48.8% of the genes could not be adequately imputed by any of the methods that we tested. In those cases gene expression levels were best predicted by the mean expression across all cells, i.e. assuming no measurable expression variation between cells. These findings suggest that different imputation methods are optimal for different genes. We thus implemented an R-package called ADImpute (available via Bioconductor https://bioconductor.org/packages/release/bioc/html/ADImpute.html) that automatically determines the best imputation method for each gene in a dataset. Our work represents a paradigm shift by demonstrating that there is no single best imputation method. Instead, we propose that imputation should maximally exploit external information and be adapted to gene-specific features, such as expression level and expression variation across cells. |
format | Online Article Text |
id | pubmed-8890719 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8890719 2022-03-03 Regulatory network-based imputation of dropouts in single-cell RNA sequencing data Leote, Ana Carolina Wu, Xiaohui Beyer, Andreas PLoS Comput Biol Research Article Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Further, it is unknown if all genes equally benefit from imputation or which imputation method works best for a given gene. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Further, the cell-to-cell variation of 11.3% to 48.8% of the genes could not be adequately imputed by any of the methods that we tested. In those cases gene expression levels were best predicted by the mean expression across all cells, i.e. assuming no measurable expression variation between cells. These findings suggest that different imputation methods are optimal for different genes. We thus implemented an R-package called ADImpute (available via Bioconductor https://bioconductor.org/packages/release/bioc/html/ADImpute.html) that automatically determines the best imputation method for each gene in a dataset. Our work represents a paradigm shift by demonstrating that there is no single best imputation method. Instead, we propose that imputation should maximally exploit external information and be adapted to gene-specific features, such as expression level and expression variation across cells. Public Library of Science 2022-02-17 /pmc/articles/PMC8890719/ /pubmed/35176023 http://dx.doi.org/10.1371/journal.pcbi.1009849 Text en © 2022 Leote et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Leote, Ana Carolina Wu, Xiaohui Beyer, Andreas Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title | Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title_full | Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title_fullStr | Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title_full_unstemmed | Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title_short | Regulatory network-based imputation of dropouts in single-cell RNA sequencing data |
title_sort | regulatory network-based imputation of dropouts in single-cell rna sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8890719/ https://www.ncbi.nlm.nih.gov/pubmed/35176023 http://dx.doi.org/10.1371/journal.pcbi.1009849 |
work_keys_str_mv | AT leoteanacarolina regulatorynetworkbasedimputationofdropoutsinsinglecellrnasequencingdata AT wuxiaohui regulatorynetworkbasedimputationofdropoutsinsinglecellrnasequencingdata AT beyerandreas regulatorynetworkbasedimputationofdropoutsinsinglecellrnasequencingdata |
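
The description above states that different imputation methods are optimal for different genes and that a per-gene choice can be automated. The following is a minimal illustrative sketch of that idea in Python, not the ADImpute implementation (ADImpute is an R package, and none of its functions are used here). All names (`baseline_predict`, `network_predict`, `choose_method_per_gene`), the toy expression values, the network coefficients, and the leave-one-cell-out scoring are assumptions made for illustration only.

```python
from statistics import mean

def baseline_predict(expr, gene, cell, network):
    """Baseline: mean of the gene across all other cells (no cell-to-cell variation)."""
    others = [v for c, v in enumerate(expr[gene]) if c != cell]
    return mean(others)

def network_predict(expr, gene, cell, network):
    """Network-based: baseline plus weighted deviations of the gene's regulators
    in the same cell. `network` maps target gene -> {regulator: coefficient};
    in the setting described above, such coefficients would come from external data."""
    pred = baseline_predict(expr, gene, cell, network)
    for regulator, coef in network.get(gene, {}).items():
        values = expr[regulator]
        pred += coef * (values[cell] - mean(values))
    return pred

def choose_method_per_gene(expr, network):
    """Hide each observed value in turn, predict it with every method,
    and keep the method with the lowest mean squared error for each gene."""
    methods = {"baseline": baseline_predict, "network": network_predict}
    best = {}
    for gene, values in expr.items():
        errors = {}
        for name, predict in methods.items():
            squared = [(predict(expr, gene, cell, network) - truth) ** 2
                       for cell, truth in enumerate(values)]
            errors[name] = mean(squared)
        # On a tie, min() returns the first key, i.e. the simpler baseline.
        best[gene] = min(errors, key=errors.get)
    return best

# Hypothetical toy data: three genes measured in four cells.
expr = {
    "TF": [1.0, 2.0, 3.0, 4.0],
    "TARGET": [2.0, 4.0, 6.0, 8.0],   # tracks TF closely
    "FLAT": [5.0, 5.1, 4.9, 5.0],     # almost no variation between cells
}
# Hypothetical regulator coefficients; the FLAT edge is deliberately misleading.
network = {"TARGET": {"TF": 2.0}, "FLAT": {"TF": 1.0}}

best = choose_method_per_gene(expr, network)
print(best)
```

In this toy setting the gene that follows its regulator is assigned the network-based predictor, while the nearly constant gene and the gene without regulators fall back to the mean, mirroring the observation that some genes are best predicted by their mean expression across cells.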