
Multiple signatures of a disease in potential biomarker space: Getting the signatures consensus and identification of novel biomarkers

BACKGROUND: The lack of consensus among reported gene signature subsets (GSSs) in multi-gene biomarker discovery studies is often a concern for researchers and clinicians. Subsequently, it discourages larger scale prospective studies, prevents the translation of such knowledge into a practical clini...


Bibliographic Details
Main authors: Ow, Ghim Siong; Kuznetsov, Vladimir A
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474413/
https://www.ncbi.nlm.nih.gov/pubmed/26100469
http://dx.doi.org/10.1186/1471-2164-16-S7-S2
_version_ 1782377267724288000
author Ow, Ghim Siong
Kuznetsov, Vladimir A
author_facet Ow, Ghim Siong
Kuznetsov, Vladimir A
author_sort Ow, Ghim Siong
collection PubMed
description BACKGROUND: The lack of consensus among reported gene signature subsets (GSSs) in multi-gene biomarker discovery studies is often a concern for researchers and clinicians. Consequently, it discourages larger-scale prospective studies, prevents the translation of such knowledge into a practical clinical setting and ultimately hinders the progress of the field of biomarker-based disease classification, prognosis and prediction.
METHODS: We define all "gene identificators" (gIDs) as constituents of the entire potential disease biomarker space. For each gID in a GSS of interest ("tested GSS"/tGSS), our method counts the empirical frequency of gID co-occurrences/overlaps in other reference GSSs (rGSSs) and compares it with the expected frequency generated by a randomized sampling procedure. Comparing the empirical frequency distribution (EFD) with the expected background frequency distribution (BFD) allows the gIDs within the tGSS to be dichotomized into statistically novel (SN) and statistically common (SC) gIDs.
RESULTS: We identify SN or SC biomarkers for tGSSs obtained from previous studies of high-grade serous ovarian cancer (HG-SOC) and breast cancer (BC). For each tGSS, the EFD of gID co-occurrences/overlaps with other rGSSs is characterized by a scale- and context-dependent Pareto-like frequency distribution function. Our results indicate that, although there is little overlap between our tGSS and any individual rGSS, comparison of the EFD with the BFD suggests that beyond a confidence threshold, tested gIDs become more common in rGSSs than expected. This validates the use of our tGSS as individual or combined prognostic factors. Our method identifies SN and SC genes of a 36-gene prognostic signature that stratifies HG-SOC patients into subgroups with low, intermediate or high risk of the disease outcome. Using 70 BC rGSSs, the method also predicted SN and SC BC prognostic genes from the tested obesity and IGF1 pathway GSSs.
CONCLUSIONS: Our method provides a strategy that identifies or predicts, within a tGSS of interest, gID subsets that are either SN or SC when compared to other rGSSs. Practically, our results suggest that the IGF1 signature genes are more strongly associated with the 70 BC rGSSs than the obesity-associated signature genes are. Furthermore, both SC and SN genes in both signatures could be considered as prospective prognostic biomarkers of BC that stratify patients into groups at low or high risk of cancer development.
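The counting-and-comparison step described under METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the randomization scheme or the rule used to set the threshold, so the sketch assumes (a) that each random reference signature is drawn uniformly, without replacement, from a hypothetical gene universe and keeps the size of the real rGSS it replaces, and (b) that the threshold is the smallest occurrence count k at which the number of tested gIDs found in at least k rGSSs exceeds the background expectation. All function and parameter names are illustrative.

```python
import random
from collections import Counter


def overlap_counts(tgss, rgss_list):
    """For each gID in the tested signature, count the reference signatures containing it."""
    return {g: sum(g in r for r in rgss_list) for g in tgss}


def frequency_distribution(counts, max_k):
    """Number of gIDs that occur in exactly k reference signatures, for k = 0..max_k."""
    hist = Counter(counts.values())
    return [hist.get(k, 0) for k in range(max_k + 1)]


def background_distribution(tgss, rgss_sizes, universe, n_iter=1000, seed=0):
    """Average frequency distribution over randomized reference signatures.

    Assumption (not from the source): each random signature is a uniform
    sample, without replacement, of the same size as the real one.
    """
    rng = random.Random(seed)
    genes = sorted(universe)  # random.sample needs a sequence, not a set
    max_k = len(rgss_sizes)
    totals = [0.0] * (max_k + 1)
    for _ in range(n_iter):
        random_refs = [set(rng.sample(genes, size)) for size in rgss_sizes]
        dist = frequency_distribution(overlap_counts(tgss, random_refs), max_k)
        for k, value in enumerate(dist):
            totals[k] += value
    return [t / n_iter for t in totals]


def split_sn_sc(tgss, rgss_list, universe, n_iter=1000, seed=0):
    """Split the tested gIDs into SN and SC sets; returns (sn, sc, threshold).

    Assumed rule (not from the source): the threshold is the smallest k >= 1
    at which the empirical count of gIDs seen in at least k references
    exceeds the background expectation; gIDs at or above it are SC.
    """
    counts = overlap_counts(tgss, rgss_list)
    max_k = len(rgss_list)
    efd = frequency_distribution(counts, max_k)
    sizes = [len(r) for r in rgss_list]
    bfd = background_distribution(tgss, sizes, universe, n_iter, seed)
    threshold = None
    for k in range(1, max_k + 1):
        if sum(efd[k:]) > sum(bfd[k:]):
            threshold = k
            break
    if threshold is None:
        return set(tgss), set(), None  # nothing is more common than expected
    sc = {g for g, c in counts.items() if c >= threshold}
    return set(tgss) - sc, sc, threshold
```

Calling `split_sn_sc(tgss, rgss_list, universe)` with a set of tested gIDs, a list of reference gene sets and the full set of candidate genes returns the SN set, the SC set and the threshold. The background here is a Monte Carlo average, so results near the threshold can vary with `n_iter` and `seed`.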
format Online
Article
Text
id pubmed-4474413
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4474413 2015-06-25 Multiple signatures of a disease in potential biomarker space: Getting the signatures consensus and identification of novel biomarkers Ow, Ghim Siong Kuznetsov, Vladimir A BMC Genomics Research BioMed Central 2015-06-11 /pmc/articles/PMC4474413/ /pubmed/26100469 http://dx.doi.org/10.1186/1471-2164-16-S7-S2 Text en Copyright © 2015 Ow and Kuznetsov; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Ow, Ghim Siong
Kuznetsov, Vladimir A
Multiple signatures of a disease in potential biomarker space: Getting the signatures consensus and identification of novel biomarkers
title Multiple signatures of a disease in potential biomarker space: Getting the signatures consensus and identification of novel biomarkers
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474413/
https://www.ncbi.nlm.nih.gov/pubmed/26100469
http://dx.doi.org/10.1186/1471-2164-16-S7-S2