CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data
BACKGROUND: For the last eight years, microarray-based classification has been a major topic in statistics, bioinformatics and biomedicine research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the so-called "p ≫ n" setting where the number of predi...
Main authors: | Slawski, M; Daumer, M; Boulesteix, A-L |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2646186/ https://www.ncbi.nlm.nih.gov/pubmed/18925941 http://dx.doi.org/10.1186/1471-2105-9-439 |
_version_ | 1782164826620952576 |
---|---|
author | Slawski, M Daumer, M Boulesteix, A-L |
collection | PubMed |
description | BACKGROUND: For the last eight years, microarray-based classification has been a major topic in statistics, bioinformatics and biomedical research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the so-called "p ≫ n" setting, where the number of predictors p by far exceeds the number of observations n, hence the term "ill-posed problem". Careful model selection and evaluation satisfying accepted good-practice standards is a very complex task for statisticians without experience in this area or for scientists with limited statistical background. The multiplicity of available methods for class prediction based on high-dimensional data is an additional practical challenge for inexperienced researchers. RESULTS: In this article, we introduce a new Bioconductor package called CMA (standing for "Classification for MicroArrays") for automatically performing variable selection, parameter tuning, classifier construction, and unbiased evaluation of the constructed classifiers using a large number of standard methods. Without much time and effort, users are provided with an overview of the unbiased accuracy of most top-performing classifiers. Furthermore, the standardized evaluation framework underlying CMA can also be beneficial in statistical research for comparison purposes, for instance if a new classifier has to be compared to existing approaches. CONCLUSION: CMA is a user-friendly, comprehensive package for classifier construction and evaluation that implements most standard approaches. It is freely available from the Bioconductor website at . |
format | Text |
id | pubmed-2646186 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-26461862009-02-23 CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data Slawski, M Daumer, M Boulesteix, A-L BMC Bioinformatics Software BACKGROUND: For the last eight years, microarray-based classification has been a major topic in statistics, bioinformatics and biomedicine research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the so-called "p ≫ n" setting where the number of predictors p by far exceeds the number of observations n, hence the term "ill-posed-problem". Careful model selection and evaluation satisfying accepted good-practice standards is a very complex task for statisticians without experience in this area or for scientists with limited statistical background. The multiplicity of available methods for class prediction based on high-dimensional data is an additional practical challenge for inexperienced researchers. RESULTS: In this article, we introduce a new Bioconductor package called CMA (standing for "Classification for MicroArrays") for automatically performing variable selection, parameter tuning, classifier construction, and unbiased evaluation of the constructed classifiers using a large number of usual methods. Without much time and effort, users are provided with an overview of the unbiased accuracy of most top-performing classifiers. Furthermore, the standardized evaluation framework underlying CMA can also be beneficial in statistical research for comparison purposes, for instance if a new classifier has to be compared to existing approaches. CONCLUSION: CMA is a user-friendly comprehensive package for classifier construction and evaluation implementing most usual approaches. It is freely available from the Bioconductor website at . BioMed Central 2008-10-16 /pmc/articles/PMC2646186/ /pubmed/18925941 http://dx.doi.org/10.1186/1471-2105-9-439 Text en Copyright © 2008 Slawski et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2646186/ https://www.ncbi.nlm.nih.gov/pubmed/18925941 http://dx.doi.org/10.1186/1471-2105-9-439 |