Adaptively capturing the heterogeneity of expression for cancer biomarker identification

BACKGROUND: Identifying cancer biomarkers from transcriptomics data is of importance to cancer research. However, transcriptomics data are often complex and heterogeneous, which complicates the identification of cancer biomarkers in practice. Currently, the heterogeneity still remains a challenge fo...

Bibliographic Details
Main Authors: Xie, Xin-Ping, Xie, Yu-Feng, Liu, Yi-Tong, Wang, Hong-Qiang
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6215657/
https://www.ncbi.nlm.nih.gov/pubmed/30390627
http://dx.doi.org/10.1186/s12859-018-2437-2
author Xie, Xin-Ping
Xie, Yu-Feng
Liu, Yi-Tong
Wang, Hong-Qiang
collection PubMed
description BACKGROUND: Identifying cancer biomarkers from transcriptomics data is of importance to cancer research. However, transcriptomics data are often complex and heterogeneous, which complicates the identification of cancer biomarkers in practice. Currently, the heterogeneity still remains a challenge for detecting subtle but consistent changes of gene expression in cancer cells. RESULTS: In this paper, we propose to adaptively capture the heterogeneity of expression across samples in a gene regulation space instead of in a gene expression space. Specifically, we transform gene expression profiles into gene regulation profiles and mathematically formulate gene regulation probabilities (GRPs)-based statistics for characterizing differential expression of genes between tumor and normal tissues. Finally, an unbiased estimator (aGRP) of GRPs is devised that can interrogate and adaptively capture the heterogeneity of gene expression. We also derived an asymptotical significance analysis procedure for the new statistic. Since no parameter needs to be preset, aGRP is easy and friendly to use for researchers without computer programming background. We evaluated the proposed method on both simulated data and real-world data and compared with previous methods. Experimental results demonstrated the superior performance of the proposed method in exploring the heterogeneity of expression for capturing subtle but consistent alterations of gene expression in cancer. CONCLUSIONS: Expression heterogeneity largely influences the performance of cancer biomarker identification from transcriptomics data. Models are needed that efficiently deal with the expression heterogeneity. The proposed method can be a standalone tool due to its capacity of adaptively capturing the sample heterogeneity and the simplicity in use. SOFTWARE AVAILABILITY: The source code of aGRP can be downloaded from https://github.com/hqwang126/aGRP. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2437-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6215657
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6215657 2018-11-08 BMC Bioinformatics Research Article BioMed Central 2018-11-03 /pmc/articles/PMC6215657/ /pubmed/30390627 http://dx.doi.org/10.1186/s12859-018-2437-2 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Adaptively capturing the heterogeneity of expression for cancer biomarker identification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6215657/
https://www.ncbi.nlm.nih.gov/pubmed/30390627
http://dx.doi.org/10.1186/s12859-018-2437-2