Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data
BACKGROUND: A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on performing multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disea...
Main Authors: | Baty, Florent; Bihl, Michel P; Perrière, Guy; Culhane, Aedín C; Brutsche, Martin H |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2005 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261161/ https://www.ncbi.nlm.nih.gov/pubmed/16191195 http://dx.doi.org/10.1186/1471-2105-6-239 |
_version_ | 1782125866773381120 |
---|---|
author | Baty, Florent Bihl, Michel P Perrière, Guy Culhane, Aedín C Brutsche, Martin H |
author_facet | Baty, Florent Bihl, Michel P Perrière, Guy Culhane, Aedín C Brutsche, Martin H |
author_sort | Baty, Florent |
collection | PubMed |
description | BACKGROUND: A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on performing multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disease class prediction of new unknown samples. Classification and prediction with BGA are classically performed using the whole set of genes, and no variable selection is required. We hypothesize that an optimized selection of highly discriminating genes might improve the prediction power of BGA. RESULTS: We propose an optimized between-group classification (OBC) which uses a jackknife-based gene selection procedure. OBC emphasizes classification accuracy rather than feature selection. OBC is a backward optimization procedure that maximizes the percentage of between-group inertia by removing the least influential genes one by one from the analysis. This selects a subset of highly discriminative genes that optimizes disease class prediction. We applied OBC to four datasets and compared it to other classification methods. CONCLUSION: OBC considerably improved the classification and predictive accuracy of BGA when assessed using independent data sets and leave-one-out cross-validation. AVAILABILITY: The R code is freely available [see Additional file 1] as well as supplementary information [see Additional file 2]. |
format | Text |
id | pubmed-1261161 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1261161 2005-10-22 Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data Baty, Florent Bihl, Michel P Perrière, Guy Culhane, Aedín C Brutsche, Martin H BMC Bioinformatics Methodology Article BACKGROUND: A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on performing multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disease class prediction of new unknown samples. Classification and prediction with BGA are classically performed using the whole set of genes, and no variable selection is required. We hypothesize that an optimized selection of highly discriminating genes might improve the prediction power of BGA. RESULTS: We propose an optimized between-group classification (OBC) which uses a jackknife-based gene selection procedure. OBC emphasizes classification accuracy rather than feature selection. OBC is a backward optimization procedure that maximizes the percentage of between-group inertia by removing the least influential genes one by one from the analysis. This selects a subset of highly discriminative genes that optimizes disease class prediction. We applied OBC to four datasets and compared it to other classification methods. CONCLUSION: OBC considerably improved the classification and predictive accuracy of BGA when assessed using independent data sets and leave-one-out cross-validation. AVAILABILITY: The R code is freely available [see Additional file 1] as well as supplementary information [see Additional file 2]. BioMed Central 2005-09-28 /pmc/articles/PMC1261161/ /pubmed/16191195 http://dx.doi.org/10.1186/1471-2105-6-239 Text en Copyright © 2005 Baty et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Baty, Florent Bihl, Michel P Perrière, Guy Culhane, Aedín C Brutsche, Martin H Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title | Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title_full | Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title_fullStr | Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title_full_unstemmed | Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title_short | Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
title_sort | optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261161/ https://www.ncbi.nlm.nih.gov/pubmed/16191195 http://dx.doi.org/10.1186/1471-2105-6-239 |
work_keys_str_mv | AT batyflorent optimizedbetweengroupclassificationanewjackknifebasedgeneselectionprocedureforgenomewideexpressiondata AT bihlmichelp optimizedbetweengroupclassificationanewjackknifebasedgeneselectionprocedureforgenomewideexpressiondata AT perriereguy optimizedbetweengroupclassificationanewjackknifebasedgeneselectionprocedureforgenomewideexpressiondata AT culhaneaedinc optimizedbetweengroupclassificationanewjackknifebasedgeneselectionprocedureforgenomewideexpressiondata AT brutschemartinh optimizedbetweengroupclassificationanewjackknifebasedgeneselectionprocedureforgenomewideexpressiondata |
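
The abstract describes OBC as a backward procedure that removes genes one at a time to maximize the percentage of between-group inertia. The Python sketch below illustrates that idea only; it is not the authors' R code (Additional file 1). It assumes a plain column-centered ordination with uniform sample weights, so that total and between-group inertia are sums of per-gene contributions, and it assumes a fixed target number of genes as the stopping rule. The names `per_gene_inertia`, `backward_select`, and `n_keep` are hypothetical.

```python
import numpy as np


def per_gene_inertia(X, groups):
    """Return per-gene between-group and total sums of squares.

    X is a samples-by-genes array; groups holds one class label per sample.
    Assumption: inertia is measured after centering each gene on its mean.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    centered = X - X.mean(axis=0)
    total = (centered ** 2).sum(axis=0)
    between = np.zeros(X.shape[1])
    for g in np.unique(groups):
        rows = centered[groups == g]
        # group size times squared deviation of the group mean, per gene
        between += rows.shape[0] * rows.mean(axis=0) ** 2
    return between, total


def backward_select(X, groups, n_keep):
    """Drop genes one by one until n_keep remain (hypothetical stopping rule).

    At each step every remaining gene is left out in turn, and the gene whose
    omission gives the highest between/total inertia ratio is removed.
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    between, total = per_gene_inertia(X, groups)
    keep = list(range(between.size))
    while len(keep) > n_keep:
        b, t = between[keep], total[keep]
        denom = t.sum() - t
        # ratio of between-group to total inertia with each gene left out
        ratios = np.divide(b.sum() - b, denom,
                           out=np.zeros_like(denom), where=denom > 0)
        keep.pop(int(np.argmax(ratios)))
    ratio = between[keep].sum() / total[keep].sum()
    return keep, ratio
```

Calling `backward_select(X, groups, n_keep=50)` on an expression matrix would return the indices of the retained genes and the resulting inertia ratio. Because the ratio here is additive across genes, each leave-one-out evaluation is cheap; an ordination that rescales or reweights genes would require recomputing the analysis at every step.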