
A novel method for predicting cell abundance based on single-cell RNA-seq data

BACKGROUND: Understanding which cell types make up an intact tissue, and in what proportions, is important because changes in certain cell types underlie human disease. Although cell type composition and proportions can be obtained by single-cell sequencing, single-cell sequencing...


Bibliographic Details
Main authors: Peng, Jiajie, Han, Lu, Shang, Xuequn
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8386079/
https://www.ncbi.nlm.nih.gov/pubmed/34433409
http://dx.doi.org/10.1186/s12859-021-04187-4
author Peng, Jiajie
Han, Lu
Shang, Xuequn
collection PubMed
description BACKGROUND: Understanding which cell types make up an intact tissue, and in what proportions, is important because changes in certain cell types underlie human disease. Although cell type composition and proportions can be obtained by single-cell sequencing, that technology is currently expensive and cannot be applied in clinical studies involving a large number of subjects. It is therefore useful to combine a bulk RNA-Seq dataset with a single-cell RNA-seq dataset and deconvolute the bulk data to obtain the cell type composition of the tissue. RESULTS: By analyzing existing cell population prediction methods, we found that most of them require a cell-type-specific gene expression profile as input, in the form of a signature matrix. However, in real applications a suitable signature matrix is not always available. To solve this problem, we propose a novel method, named DCap, to predict cell abundance. DCap is a deconvolution method based on non-negative least squares. During the non-negative least squares calculation, DCap applies weights that account for measurement noise in the bulk RNA-seq data and calculation error in the single-cell RNA-seq data, and it performs a weighted iterative least-squares calculation. By weighting the bulk tissue gene expression matrix and the single-cell gene expression matrix, DCap minimizes the measurement error of bulk RNA-Seq and also reduces errors resulting from differences in the number of expressed genes among cells of the same type in different samples. Evaluation tests show that DCap predicts cell type abundance better than existing methods. CONCLUSION: DCap solves the deconvolution problem using weighted non-negative least squares to predict cell type abundance in tissues. DCap gives better prediction results and does not require a signature matrix of cell-type-specific gene expression profiles to be prepared in advance. Using DCap, we can better study changes in cell proportions in diseased tissues and provide more information for the follow-up treatment of diseases.
format Online
Article
Text
id pubmed-8386079
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8386079 2021-08-26 A novel method for predicting cell abundance based on single-cell RNA-seq data Peng, Jiajie Han, Lu Shang, Xuequn BMC Bioinformatics Research BioMed Central 2021-08-25 /pmc/articles/PMC8386079/ /pubmed/34433409 http://dx.doi.org/10.1186/s12859-021-04187-4 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A novel method for predicting cell abundance based on single-cell RNA-seq data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8386079/
https://www.ncbi.nlm.nih.gov/pubmed/34433409
http://dx.doi.org/10.1186/s12859-021-04187-4
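The abstract in this record describes deconvolution by weighted, iteratively recomputed non-negative least squares, but it does not give DCap's actual weighting formula or solver. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a small reference matrix S (genes by cell types) averaged from single-cell data and a bulk expression vector b. The function names `weighted_nnls` and `deconvolve`, the reciprocal-squared weights, the coordinate-descent solver, and the toy numbers are all illustrative assumptions.

```python
def weighted_nnls(S, b, w, n_sweeps=500):
    """Minimise sum_g w[g] * (b[g] - sum_k S[g][k] * x[k])**2 subject to x >= 0,
    using cyclic coordinate descent (a simple stand-in solver)."""
    n_genes, n_types = len(S), len(S[0])
    x = [0.0] * n_types
    r = list(b)  # residual b - S x; equals b while x is all zeros
    for _ in range(n_sweeps):
        for k in range(n_types):
            denom = sum(w[g] * S[g][k] ** 2 for g in range(n_genes))
            if denom == 0.0:
                continue  # column carries no weighted signal
            grad = sum(w[g] * S[g][k] * r[g] for g in range(n_genes))
            new_xk = max(0.0, x[k] + grad / denom)  # clip to keep x[k] >= 0
            delta = new_xk - x[k]
            if delta != 0.0:
                for g in range(n_genes):
                    r[g] -= S[g][k] * delta
                x[k] = new_xk
    return x


def deconvolve(S, b, n_rounds=5, eps=1e-6):
    """Iteratively reweighted NNLS; returns proportions that sum to 1."""
    w = [1.0] * len(b)  # first pass is unweighted
    x = weighted_nnls(S, b, w)
    for _ in range(n_rounds):
        fitted = [sum(S[g][k] * x[k] for k in range(len(x)))
                  for g in range(len(b))]
        # Assumed weighting: down-weight highly expressed genes.
        w = [1.0 / (f * f + eps) for f in fitted]
        x = weighted_nnls(S, b, w)
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Toy data: 4 genes, 3 cell types, and a noiseless bulk mixture.
S = [[10.0, 1.0, 0.0],
     [0.0, 8.0, 2.0],
     [1.0, 0.0, 9.0],
     [5.0, 5.0, 5.0]]
true_props = [0.5, 0.3, 0.2]
b = [sum(S[g][k] * true_props[k] for k in range(3)) for g in range(4)]

proportions = deconvolve(S, b)
print([round(p, 4) for p in proportions])
```

On noiseless toy data any positive weights lead to the same solution, so this example only exercises the mechanics. The choice of weights matters once the bulk and single-cell measurements contain noise, which is the situation the abstract addresses.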