Robust Model Selection for Classification of Microarrays
Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems remain unsolved problems in real clinical settings. To reduce the cost, we need a supervised classifier involving the smallest number o...
Main Authors: | Suzuki, Ikumi; Takenouchi, Takashi; Ohira, Miki; Oba, Shigeyuki; Ishii, Shin |
---|---|
Format: | Text |
Language: | English |
Published: | Libertas Academica 2009 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2730179/ https://www.ncbi.nlm.nih.gov/pubmed/19718450 |
_version_ | 1782170864423272448 |
---|---|
author | Suzuki, Ikumi Takenouchi, Takashi Ohira, Miki Oba, Shigeyuki Ishii, Shin |
author_facet | Suzuki, Ikumi Takenouchi, Takashi Ohira, Miki Oba, Shigeyuki Ishii, Shin |
author_sort | Suzuki, Ikumi |
collection | PubMed |
description | Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems remain unsolved problems in real clinical settings. To reduce the cost, we need a supervised classifier involving the smallest number of genes, as long as the classifier is sufficiently reliable. To achieve a reliable classifier, we should assess candidate classifiers and select the best one. When selecting the best classifier, however, the assessment criterion inevitably has large variance because of the limited number of samples and non-negligible observation noise. Therefore, even if a classifier with a very small number of genes exhibited the smallest leave-one-out cross-validation (LOO) error rate, it would not necessarily be reliable, because classifiers based on a small number of genes tend to show large variance. We propose a robust model selection criterion, the min-max criterion, based on a bootstrap resampling simulation that assesses the variance of the estimated classification error rates. We applied our assessment framework to four published real gene expression datasets and one synthetic dataset. We found that a state-of-the-art procedure, weighted voting classifiers with the LOO criterion, had a non-negligible risk of selecting extremely poor classifiers, whereas the new min-max criterion could eliminate that risk. These findings suggest that our criterion offers a safer procedure for designing a practical cancer diagnosis system. |
format | Text |
id | pubmed-2730179 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-2730179 2009-08-28 Robust Model Selection for Classification of Microarrays Suzuki, Ikumi Takenouchi, Takashi Ohira, Miki Oba, Shigeyuki Ishii, Shin Cancer Inform Original Research Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems remain unsolved problems in real clinical settings. To reduce the cost, we need a supervised classifier involving the smallest number of genes, as long as the classifier is sufficiently reliable. To achieve a reliable classifier, we should assess candidate classifiers and select the best one. When selecting the best classifier, however, the assessment criterion inevitably has large variance because of the limited number of samples and non-negligible observation noise. Therefore, even if a classifier with a very small number of genes exhibited the smallest leave-one-out cross-validation (LOO) error rate, it would not necessarily be reliable, because classifiers based on a small number of genes tend to show large variance. We propose a robust model selection criterion, the min-max criterion, based on a bootstrap resampling simulation that assesses the variance of the estimated classification error rates. We applied our assessment framework to four published real gene expression datasets and one synthetic dataset. We found that a state-of-the-art procedure, weighted voting classifiers with the LOO criterion, had a non-negligible risk of selecting extremely poor classifiers, whereas the new min-max criterion could eliminate that risk. These findings suggest that our criterion offers a safer procedure for designing a practical cancer diagnosis system. Libertas Academica 2009-06-25 /pmc/articles/PMC2730179/ /pubmed/19718450 Text en © 2009 The authors. http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/). |
spellingShingle | Original Research Suzuki, Ikumi Takenouchi, Takashi Ohira, Miki Oba, Shigeyuki Ishii, Shin Robust Model Selection for Classification of Microarrays |
title | Robust Model Selection for Classification of Microarrays |
title_full | Robust Model Selection for Classification of Microarrays |
title_fullStr | Robust Model Selection for Classification of Microarrays |
title_full_unstemmed | Robust Model Selection for Classification of Microarrays |
title_short | Robust Model Selection for Classification of Microarrays |
title_sort | robust model selection for classification of microarrays |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2730179/ https://www.ncbi.nlm.nih.gov/pubmed/19718450 |
work_keys_str_mv | AT suzukiikumi robustmodelselectionforclassificationofmicroarrays AT takenouchitakashi robustmodelselectionforclassificationofmicroarrays AT ohiramiki robustmodelselectionforclassificationofmicroarrays AT obashigeyuki robustmodelselectionforclassificationofmicroarrays AT ishiishin robustmodelselectionforclassificationofmicroarrays |