A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples
BACKGROUND: The identification of cell type-specific genes (markers) is an essential step for the deconvolution of the cellular fractions, primarily, from the gene expression data of a bulk sample. However, the genes with significant changes identified by pair-wise comparisons cannot indeed represen...
Main authors: | Li, Huamei; Sharma, Amit; Ming, Wenglong; Sun, Xiao; Liu, Hongde |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7510109/ https://www.ncbi.nlm.nih.gov/pubmed/32967610 http://dx.doi.org/10.1186/s12864-020-06888-1 |
_version_ | 1783585721872809984 |
---|---|
author | Li, Huamei Sharma, Amit Ming, Wenglong Sun, Xiao Liu, Hongde |
author_facet | Li, Huamei Sharma, Amit Ming, Wenglong Sun, Xiao Liu, Hongde |
author_sort | Li, Huamei |
collection | PubMed |
description | BACKGROUND: The identification of cell type-specific genes (markers) is an essential step for the deconvolution of the cellular fractions, primarily from the gene expression data of a bulk sample. However, the genes with significant changes identified by pair-wise comparisons cannot truly represent the specificity of gene expression across multiple conditions. In addition, knowledge about the identification of gene expression markers across multiple conditions is still scarce. RESULTS: Herein, we developed a hybrid tool, LinDeconSeq, which consists of 1) identifying marker genes using specificity scoring and mutual linearity strategies across any number of cell types, and 2) predicting cellular fractions of bulk samples using weighted robust linear regression with the marker genes identified in the first stage. On multiple publicly available datasets, the marker genes identified by LinDeconSeq demonstrated better accuracy and reproducibility compared to MGFM and RNentropy. Among deconvolution methods, LinDeconSeq showed low average deviations (≤0.0958) and high average Pearson correlations (≥0.8792) between the predicted and actual fractions on the benchmark datasets. Importantly, the cellular fractions predicted by LinDeconSeq appear to be relevant in the diagnosis of acute myeloid leukemia (AML). The distinct cellular fractions in granulocyte-monocyte progenitor (GMP), lymphoid-primed multipotent progenitor (LMPP) and monocytes (MONO) were found to be closely associated with AML compared to the healthy samples. Moreover, the heterogeneity of cellular fractions in AML patients divided these patients into two subgroups, differing in both prognosis and mutation patterns. The GMP fraction was the most pronounced difference between these two subgroups, particularly in SubgroupA, which was strongly associated with better AML prognosis and a younger population. Overall, the identification of marker genes by LinDeconSeq represents an improved feature for deconvolution. The data processing strategy with regard to the cellular fractions used in this study also showed potential for the diagnosis and prognosis of diseases. CONCLUSIONS: Taken together, we developed a freely available and open-source tool LinDeconSeq (https://github.com/lihuamei/LinDeconSeq), which includes marker identification and deconvolution procedures. LinDeconSeq is comparable to other current methods in terms of accuracy when applied to benchmark datasets and has broad application in studying clinical outcomes and disease-specific molecular mechanisms. |
format | Online Article Text |
id | pubmed-7510109 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7510109 2020-09-24 A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples Li, Huamei Sharma, Amit Ming, Wenglong Sun, Xiao Liu, Hongde BMC Genomics Research Article BACKGROUND: The identification of cell type-specific genes (markers) is an essential step for the deconvolution of the cellular fractions, primarily from the gene expression data of a bulk sample. However, the genes with significant changes identified by pair-wise comparisons cannot truly represent the specificity of gene expression across multiple conditions. In addition, knowledge about the identification of gene expression markers across multiple conditions is still scarce. RESULTS: Herein, we developed a hybrid tool, LinDeconSeq, which consists of 1) identifying marker genes using specificity scoring and mutual linearity strategies across any number of cell types, and 2) predicting cellular fractions of bulk samples using weighted robust linear regression with the marker genes identified in the first stage. On multiple publicly available datasets, the marker genes identified by LinDeconSeq demonstrated better accuracy and reproducibility compared to MGFM and RNentropy. Among deconvolution methods, LinDeconSeq showed low average deviations (≤0.0958) and high average Pearson correlations (≥0.8792) between the predicted and actual fractions on the benchmark datasets. Importantly, the cellular fractions predicted by LinDeconSeq appear to be relevant in the diagnosis of acute myeloid leukemia (AML). The distinct cellular fractions in granulocyte-monocyte progenitor (GMP), lymphoid-primed multipotent progenitor (LMPP) and monocytes (MONO) were found to be closely associated with AML compared to the healthy samples. Moreover, the heterogeneity of cellular fractions in AML patients divided these patients into two subgroups, differing in both prognosis and mutation patterns. The GMP fraction was the most pronounced difference between these two subgroups, particularly in SubgroupA, which was strongly associated with better AML prognosis and a younger population. Overall, the identification of marker genes by LinDeconSeq represents an improved feature for deconvolution. The data processing strategy with regard to the cellular fractions used in this study also showed potential for the diagnosis and prognosis of diseases. CONCLUSIONS: Taken together, we developed a freely available and open-source tool LinDeconSeq (https://github.com/lihuamei/LinDeconSeq), which includes marker identification and deconvolution procedures. LinDeconSeq is comparable to other current methods in terms of accuracy when applied to benchmark datasets and has broad application in studying clinical outcomes and disease-specific molecular mechanisms. BioMed Central 2020-09-23 /pmc/articles/PMC7510109/ /pubmed/32967610 http://dx.doi.org/10.1186/s12864-020-06888-1 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Li, Huamei Sharma, Amit Ming, Wenglong Sun, Xiao Liu, Hongde A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title | A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title_full | A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title_fullStr | A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title_full_unstemmed | A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title_short | A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
title_sort | deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7510109/ https://www.ncbi.nlm.nih.gov/pubmed/32967610 http://dx.doi.org/10.1186/s12864-020-06888-1 |
work_keys_str_mv | AT lihuamei adeconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT sharmaamit adeconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT mingwenglong adeconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT sunxiao adeconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT liuhongde adeconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT lihuamei deconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT sharmaamit deconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT mingwenglong deconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT sunxiao deconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples AT liuhongde deconvolutionmethodanditsapplicationinanalyzingthecellularfractionsinacutemyeloidleukemiasamples |