A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples

BACKGROUND: The identification of cell type-specific genes (markers) is an essential step for the deconvolution of the cellular fractions, primarily, from the gene expression data of a bulk sample. However, the genes with significant changes identified by pair-wise comparisons cannot indeed represen...

Bibliographic Details
Main Authors: Li, Huamei; Sharma, Amit; Ming, Wenglong; Sun, Xiao; Liu, Hongde
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7510109/
https://www.ncbi.nlm.nih.gov/pubmed/32967610
http://dx.doi.org/10.1186/s12864-020-06888-1
author Li, Huamei
Sharma, Amit
Ming, Wenglong
Sun, Xiao
Liu, Hongde
collection PubMed
description BACKGROUND: The identification of cell type-specific genes (markers) is an essential step in deconvolving cellular fractions, primarily from the gene expression data of a bulk sample. However, genes that show significant changes in pair-wise comparisons do not necessarily reflect the specificity of gene expression across multiple conditions. In addition, knowledge about identifying gene expression markers across multiple conditions remains scarce. RESULTS: We developed a hybrid tool, LinDeconSeq, which consists of 1) identifying marker genes using specificity scoring and mutual linearity strategies across any number of cell types, and 2) predicting cellular fractions of bulk samples using weighted robust linear regression with the marker genes identified in the first stage. On multiple publicly available datasets, the marker genes identified by LinDeconSeq demonstrated better accuracy and reproducibility than those identified by MGFM and RNentropy. Among deconvolution methods, LinDeconSeq showed low average deviations (≤0.0958) and high average Pearson correlations (≥0.8792) between the predicted and actual fractions on the benchmark datasets. Importantly, the cellular fractions predicted by LinDeconSeq appear to be relevant to the diagnosis of acute myeloid leukemia (AML). Relative to healthy samples, distinct cellular fractions of granulocyte-monocyte progenitors (GMP), lymphoid-primed multipotent progenitors (LMPP) and monocytes (MONO) were closely associated with AML. Moreover, the heterogeneity of cellular fractions divided AML patients into two subgroups that differed in both prognosis and mutation patterns. The GMP fraction showed the most pronounced difference between the two subgroups, particularly in SubgroupA, which was strongly associated with better AML prognosis and a younger patient population. Overall, the marker genes identified by LinDeconSeq provide improved features for deconvolution. The data processing strategy applied to the cellular fractions in this study also showed potential for the diagnosis and prognosis of diseases. CONCLUSIONS: We developed a freely available, open-source tool, LinDeconSeq (https://github.com/lihuamei/LinDeconSeq), which includes marker identification and deconvolution procedures. LinDeconSeq is comparable in accuracy to other current methods when applied to benchmark datasets and has broad application in studying clinical outcomes and disease-specific molecular mechanisms.
format Online
Article
Text
id pubmed-7510109
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7510109 2020-09-24 A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples Li, Huamei Sharma, Amit Ming, Wenglong Sun, Xiao Liu, Hongde BMC Genomics Research Article BioMed Central 2020-09-23 /pmc/articles/PMC7510109/ /pubmed/32967610 http://dx.doi.org/10.1186/s12864-020-06888-1 Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7510109/
https://www.ncbi.nlm.nih.gov/pubmed/32967610
http://dx.doi.org/10.1186/s12864-020-06888-1