Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile
BACKGROUND: Microarray data have been used for gene signature selection to predict clinical outcomes. Many studies have attempted to identify factors that affect model performance, with little success. Fine-tuning model parameters and optimizing each step of the modeling process often...
Main authors: | Zhao, Chen; Shi, Leming; Tong, Weida; Shaughnessy, John D; Oberthuer, André; Pusztai, Lajos; Deng, Youping; Symmans, W Fraser; Shi, Tieliu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287499/ https://www.ncbi.nlm.nih.gov/pubmed/22369035 http://dx.doi.org/10.1186/1471-2164-12-S5-S3 |
_version_ | 1782224677118148608 |
---|---|
author | Zhao, Chen; Shi, Leming; Tong, Weida; Shaughnessy, John D; Oberthuer, André; Pusztai, Lajos; Deng, Youping; Symmans, W Fraser; Shi, Tieliu |
author_facet | Zhao, Chen; Shi, Leming; Tong, Weida; Shaughnessy, John D; Oberthuer, André; Pusztai, Lajos; Deng, Youping; Symmans, W Fraser; Shi, Tieliu |
author_sort | Zhao, Chen |
collection | PubMed |
description | BACKGROUND: Microarray data have been used for gene signature selection to predict clinical outcomes. Many studies have attempted to identify factors that affect model performance, with little success. Fine-tuning model parameters and optimizing each step of the modeling process often result in over-fitting without improving performance. RESULTS: We propose a quantitative measurement, termed consistency degree, to detect the correlation between disease endpoint and gene expression profile. Different endpoints were shown to have different consistency degrees with gene expression profiles. The validity of this measurement as an estimate of consistency was tested, with significance at a p-value less than 2.2e-16 for all of the studied endpoints. According to the consistency degree score, the overall survival milestone outcome of multiple myeloma was proposed to be extended from 730 days to 1561 days, which is more consistent with the gene expression profile. CONCLUSION: For various clinical endpoints, the maximum predictive power of different microarray-based models is limited by the correlation between the endpoint and the gene expression profile of disease samples, as indicated by the consistency degree score. In addition, previously defined clinical outcomes can be reassessed and refined to be more coherent with the related disease gene expression profile. Our findings point to an entirely new direction for assessing microarray-based predictive models and provide important information for gene-signature-based clinical applications. |
format | Online Article Text |
id | pubmed-3287499 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3287499 2012-03-01 Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile Zhao, Chen Shi, Leming Tong, Weida Shaughnessy, John D Oberthuer, André Pusztai, Lajos Deng, Youping Symmans, W Fraser Shi, Tieliu BMC Genomics Research Article BACKGROUND: Microarray data have been used for gene signature selection to predict clinical outcomes. Many studies have attempted to identify factors that affect model performance, with little success. Fine-tuning model parameters and optimizing each step of the modeling process often result in over-fitting without improving performance. RESULTS: We propose a quantitative measurement, termed consistency degree, to detect the correlation between disease endpoint and gene expression profile. Different endpoints were shown to have different consistency degrees with gene expression profiles. The validity of this measurement as an estimate of consistency was tested, with significance at a p-value less than 2.2e-16 for all of the studied endpoints. According to the consistency degree score, the overall survival milestone outcome of multiple myeloma was proposed to be extended from 730 days to 1561 days, which is more consistent with the gene expression profile. CONCLUSION: For various clinical endpoints, the maximum predictive power of different microarray-based models is limited by the correlation between the endpoint and the gene expression profile of disease samples, as indicated by the consistency degree score. In addition, previously defined clinical outcomes can be reassessed and refined to be more coherent with the related disease gene expression profile. Our findings point to an entirely new direction for assessing microarray-based predictive models and provide important information for gene-signature-based clinical applications.
BioMed Central 2011-12-23 /pmc/articles/PMC3287499/ /pubmed/22369035 http://dx.doi.org/10.1186/1471-2164-12-S5-S3 Text en Copyright ©2011 Zhao et al. licensee BioMed Central Ltd http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhao, Chen Shi, Leming Tong, Weida Shaughnessy, John D Oberthuer, André Pusztai, Lajos Deng, Youping Symmans, W Fraser Shi, Tieliu Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title | Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title_full | Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title_fullStr | Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title_full_unstemmed | Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title_short | Maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
title_sort | maximum predictive power of the microarray-based models for clinical outcomes is limited by correlation between endpoint and gene expression profile |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287499/ https://www.ncbi.nlm.nih.gov/pubmed/22369035 http://dx.doi.org/10.1186/1471-2164-12-S5-S3 |
work_keys_str_mv | AT zhaochen maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT shileming maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT tongweida maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT shaughnessyjohnd maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT oberthuerandre maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT pusztailajos maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT dengyouping maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT symmanswfraser maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile AT shitieliu maximumpredictivepowerofthemicroarraybasedmodelsforclinicaloutcomesislimitedbycorrelationbetweenendpointandgeneexpressionprofile |