Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot...

Bibliographic Details
Main Authors: Shao, Li; Fan, Xiaohui; Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702597/
https://www.ncbi.nlm.nih.gov/pubmed/23861920
http://dx.doi.org/10.1371/journal.pone.0068579
author Shao, Li
Fan, Xiaohui
Cheng, Ningtao
Wu, Leihong
Cheng, Yiyu
author_facet Shao, Li
Fan, Xiaohui
Cheng, Ningtao
Wu, Leihong
Cheng, Yiyu
author_sort Shao, Li
collection PubMed
description The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability.
format Online
Article
Text
id pubmed-3702597
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3702597 2013-07-16 Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment Shao, Li Fan, Xiaohui Cheng, Ningtao Wu, Leihong Cheng, Yiyu PLoS One Research Article The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability.
Public Library of Science 2013-07-05 /pmc/articles/PMC3702597/ /pubmed/23861920 http://dx.doi.org/10.1371/journal.pone.0068579 Text en © 2013 Shao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Shao, Li
Fan, Xiaohui
Cheng, Ningtao
Wu, Leihong
Cheng, Yiyu
Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment
title Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702597/
https://www.ncbi.nlm.nih.gov/pubmed/23861920
http://dx.doi.org/10.1371/journal.pone.0068579