Robust Model Selection for Classification of Microarrays

Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems are still remaining problems in real clinical scenes. To reduce the cost, we need a supervised classifier involving the smallest number o...

Bibliographic Details
Main Authors: Suzuki, Ikumi, Takenouchi, Takashi, Ohira, Miki, Oba, Shigeyuki, Ishii, Shin
Format: Text
Language: English
Published: Libertas Academica 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2730179/
https://www.ncbi.nlm.nih.gov/pubmed/19718450
_version_ 1782170864423272448
author Suzuki, Ikumi
Takenouchi, Takashi
Ohira, Miki
Oba, Shigeyuki
Ishii, Shin
author_facet Suzuki, Ikumi
Takenouchi, Takashi
Ohira, Miki
Oba, Shigeyuki
Ishii, Shin
author_sort Suzuki, Ikumi
collection PubMed
description Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems are still remaining problems in real clinical scenes. To reduce the cost, we need a supervised classifier involving the smallest number of genes, as long as the classifier is sufficiently reliable. To achieve a reliable classifier, we should assess candidate classifiers and select the best one. In the selection process of the best classifier, however, the assessment criterion must involve large variance because of the limited number of samples and non-negligible observation noise. Therefore, even if a classifier with a very small number of genes exhibited the smallest leave-one-out cross-validation (LOO) error rate, it would not necessarily be reliable because classifiers based on a small number of genes tend to show large variance. We propose a robust model selection criterion, the min-max criterion, based on a resampling bootstrap simulation to assess the variance of estimation of classification error rates. We applied our assessment framework to four published real gene expression datasets and one synthetic dataset. We found that a state-of-the-art procedure, weighted voting classifiers with LOO criterion, had a non-negligible risk of selecting extremely poor classifiers and, on the other hand, that the new min-max criterion could eliminate that risk. These findings suggest that our criterion presents a safer procedure to design a practical cancer diagnosis system.
format Text
id pubmed-2730179
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-27301792009-08-28 Robust Model Selection for Classification of Microarrays Suzuki, Ikumi Takenouchi, Takashi Ohira, Miki Oba, Shigeyuki Ishii, Shin Cancer Inform Original Research Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems are still remaining problems in real clinical scenes. To reduce the cost, we need a supervised classifier involving the smallest number of genes, as long as the classifier is sufficiently reliable. To achieve a reliable classifier, we should assess candidate classifiers and select the best one. In the selection process of the best classifier, however, the assessment criterion must involve large variance because of the limited number of samples and non-negligible observation noise. Therefore, even if a classifier with a very small number of genes exhibited the smallest leave-one-out cross-validation (LOO) error rate, it would not necessarily be reliable because classifiers based on a small number of genes tend to show large variance. We propose a robust model selection criterion, the min-max criterion, based on a resampling bootstrap simulation to assess the variance of estimation of classification error rates. We applied our assessment framework to four published real gene expression datasets and one synthetic dataset. We found that a state-of-the-art procedure, weighted voting classifiers with LOO criterion, had a non-negligible risk of selecting extremely poor classifiers and, on the other hand, that the new min-max criterion could eliminate that risk. These findings suggest that our criterion presents a safer procedure to design a practical cancer diagnosis system. Libertas Academica 2009-06-25 /pmc/articles/PMC2730179/ /pubmed/19718450 Text en © 2009 The authors.
http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
spellingShingle Original Research
Suzuki, Ikumi
Takenouchi, Takashi
Ohira, Miki
Oba, Shigeyuki
Ishii, Shin
Robust Model Selection for Classification of Microarrays
title Robust Model Selection for Classification of Microarrays
title_full Robust Model Selection for Classification of Microarrays
title_fullStr Robust Model Selection for Classification of Microarrays
title_full_unstemmed Robust Model Selection for Classification of Microarrays
title_short Robust Model Selection for Classification of Microarrays
title_sort robust model selection for classification of microarrays
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2730179/
https://www.ncbi.nlm.nih.gov/pubmed/19718450
work_keys_str_mv AT suzukiikumi robustmodelselectionforclassificationofmicroarrays
AT takenouchitakashi robustmodelselectionforclassificationofmicroarrays
AT ohiramiki robustmodelselectionforclassificationofmicroarrays
AT obashigeyuki robustmodelselectionforclassificationofmicroarrays
AT ishiishin robustmodelselectionforclassificationofmicroarrays