Regulatory network-based imputation of dropouts in single-cell RNA sequencing data

Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusi...

Bibliographic Details
Main Authors: Leote, Ana Carolina; Wu, Xiaohui; Beyer, Andreas
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8890719/
https://www.ncbi.nlm.nih.gov/pubmed/35176023
http://dx.doi.org/10.1371/journal.pcbi.1009849
author Leote, Ana Carolina
Wu, Xiaohui
Beyer, Andreas
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in that they use only the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Further, it is unknown whether all genes benefit equally from imputation or which imputation method works best for a given gene. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets, we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Further, the cell-to-cell variation of 11.3% to 48.8% of the genes could not be adequately imputed by any of the methods that we tested. In those cases, gene expression levels were best predicted by the mean expression across all cells, i.e. assuming no measurable expression variation between cells. These findings suggest that different imputation methods are optimal for different genes. We thus implemented an R package called ADImpute (available via Bioconductor: https://bioconductor.org/packages/release/bioc/html/ADImpute.html) that automatically determines the best imputation method for each gene in a dataset. Our work represents a paradigm shift by demonstrating that there is no single best imputation method. Instead, we propose that imputation should maximally exploit external information and be adapted to gene-specific features, such as expression level and expression variation across cells.
format Online
Article
Text
id pubmed-8890719
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8890719 2022-03-03 Regulatory network-based imputation of dropouts in single-cell RNA sequencing data Leote, Ana Carolina Wu, Xiaohui Beyer, Andreas PLoS Comput Biol Research Article Public Library of Science 2022-02-17 /pmc/articles/PMC8890719/ /pubmed/35176023 http://dx.doi.org/10.1371/journal.pcbi.1009849 Text en © 2022 Leote et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Regulatory network-based imputation of dropouts in single-cell RNA sequencing data
topic Research Article