Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment
The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot...
Main authors: | Shao, Li; Fan, Xiaohui; Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702597/ https://www.ncbi.nlm.nih.gov/pubmed/23861920 http://dx.doi.org/10.1371/journal.pone.0068579 |
_version_ | 1782275841494876160 |
---|---|
author | Shao, Li Fan, Xiaohui Cheng, Ningtao Wu, Leihong Cheng, Yiyu |
author_facet | Shao, Li Fan, Xiaohui Cheng, Ningtao Wu, Leihong Cheng, Yiyu |
author_sort | Shao, Li |
collection | PubMed |
description | The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of the statistical parameters involved in classifiers. These parameters cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We extensively evaluated the impact of training sample size on model performance using 3 large-scale cancer microarray datasets provided by the second phase of the MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into routine clinical applications, the SSNR-based protocol could make microarray-based cancer outcome prediction more convenient and improve classifier reliability. |
format | Online Article Text |
id | pubmed-3702597 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3702597 2013-07-16 PLoS One Research Article Public Library of Science 2013-07-05 /pmc/articles/PMC3702597/ /pubmed/23861920 http://dx.doi.org/10.1371/journal.pone.0068579 Text en © 2013 Shao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702597/ https://www.ncbi.nlm.nih.gov/pubmed/23861920 http://dx.doi.org/10.1371/journal.pone.0068579 |