Quantifying stability in gene list ranking across microarray derived clinical biomarkers
BACKGROUND: Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. Ho...
Main authors: | Schneckener, Sebastian; Arden, Nilou S; Schuppert, Andreas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206838/ https://www.ncbi.nlm.nih.gov/pubmed/21996057 http://dx.doi.org/10.1186/1755-8794-4-73 |
_version_ | 1782215492234117120 |
---|---|
author | Schneckener, Sebastian; Arden, Nilou S; Schuppert, Andreas |
author_facet | Schneckener, Sebastian; Arden, Nilou S; Schuppert, Andreas |
author_sort | Schneckener, Sebastian |
collection | PubMed |
description | BACKGROUND: Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. However, gene expression profile studies and predictive biomarkers are often of low power, requiring numerous samples for a sound statistic, or vary between studies. Given the inconsistency of results across similar studies, methods that identify robust biomarkers from microarray data are needed to relay true biological information. Here we present a method to demonstrate that gene list stability and predictive power depends not only on the size of studies, but also on the clinical phenotype. RESULTS: Our method projects genomic tumor expression data to a lower dimensional space representing the main variation in the data. Some information regarding the phenotype resides in this low dimensional space, while some information resides in the residuum. We then introduce an information ratio (IR) as a metric defined by the partition between projected and residual space. Upon grouping phenotypes such as tumor tissue, histological grades, relapse, or aging, we show that higher IR values correlated with phenotypes that yield less robust biomarkers whereas lower IR values showed higher transferability across studies. Our results indicate that the IR is correlated with predictive accuracy. When tested across different published datasets, the IR can identify information-rich data characterizing clinical phenotypes and stable biomarkers. CONCLUSIONS: The IR presents a quantitative metric to estimate the information content of gene expression data with respect to particular phenotypes. |
format | Online Article Text |
id | pubmed-3206838 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3206838 2011-11-04 Quantifying stability in gene list ranking across microarray derived clinical biomarkers Schneckener, Sebastian Arden, Nilou S Schuppert, Andreas BMC Med Genomics Research Article BACKGROUND: Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. However, gene expression profile studies and predictive biomarkers are often of low power, requiring numerous samples for a sound statistic, or vary between studies. Given the inconsistency of results across similar studies, methods that identify robust biomarkers from microarray data are needed to relay true biological information. Here we present a method to demonstrate that gene list stability and predictive power depends not only on the size of studies, but also on the clinical phenotype. RESULTS: Our method projects genomic tumor expression data to a lower dimensional space representing the main variation in the data. Some information regarding the phenotype resides in this low dimensional space, while some information resides in the residuum. We then introduce an information ratio (IR) as a metric defined by the partition between projected and residual space. Upon grouping phenotypes such as tumor tissue, histological grades, relapse, or aging, we show that higher IR values correlated with phenotypes that yield less robust biomarkers whereas lower IR values showed higher transferability across studies. Our results indicate that the IR is correlated with predictive accuracy. When tested across different published datasets, the IR can identify information-rich data characterizing clinical phenotypes and stable biomarkers. CONCLUSIONS: The IR presents a quantitative metric to estimate the information content of gene expression data with respect to particular phenotypes. BioMed Central 2011-10-14 /pmc/articles/PMC3206838/ /pubmed/21996057 http://dx.doi.org/10.1186/1755-8794-4-73 Text en Copyright ©2011 Schneckener et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Schneckener, Sebastian Arden, Nilou S Schuppert, Andreas Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title | Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title_full | Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title_fullStr | Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title_full_unstemmed | Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title_short | Quantifying stability in gene list ranking across microarray derived clinical biomarkers |
title_sort | quantifying stability in gene list ranking across microarray derived clinical biomarkers |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206838/ https://www.ncbi.nlm.nih.gov/pubmed/21996057 http://dx.doi.org/10.1186/1755-8794-4-73 |
work_keys_str_mv | AT schneckenersebastian quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers AT ardennilous quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers AT schuppertandreas quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers |