Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data

BACKGROUND: A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disea...

Bibliographic Details
Main authors: Baty, Florent, Bihl, Michel P, Perrière, Guy, Culhane, Aedín C, Brutsche, Martin H
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261161/
https://www.ncbi.nlm.nih.gov/pubmed/16191195
http://dx.doi.org/10.1186/1471-2105-6-239
collection PubMed
description BACKGROUND: A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disease class prediction of new unknown samples. Classification and prediction with BGA are classically performed using the whole set of genes, and no variable selection is required. We hypothesized that an optimized selection of highly discriminating genes might improve the prediction power of BGA. RESULTS: We propose an optimized between-group classification (OBC) which uses a jackknife-based gene selection procedure. OBC emphasizes classification accuracy rather than feature selection. OBC is a backward optimization procedure that maximizes the percentage of between-group inertia by removing the least influential genes one by one from the analysis. This selects a subset of highly discriminative genes which optimizes disease class prediction. We applied OBC to four datasets and compared it to other classification methods. CONCLUSION: OBC considerably improved the classification and predictive accuracy of BGA when assessed using independent data sets and leave-one-out cross-validation. AVAILABILITY: The R code is freely available [see Additional file 1] as well as supplementary information [see Additional file 2].
format Text
id pubmed-1261161
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1261161 2005-10-22 BMC Bioinformatics Methodology Article BioMed Central 2005-09-28 Text en Copyright © 2005 Baty et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Methodology Article
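
The backward procedure summarized in the abstract (remove the least influential gene, one at a time, so that the percentage of between-group inertia increases) can be illustrated with a minimal sketch. This is not the authors' R code from Additional file 1. It assumes a plain PCA-style inertia with uniform sample weights and no scaling, so that inertia is additive over genes, and it stops at a hypothetical `n_keep` parameter because the record does not state the stopping rule. The function names and the toy data are likewise invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def inertia_parts(values: Sequence[float], labels: Sequence[str]) -> Tuple[float, float]:
    """Return (between-group, total) sums of squares for a single gene."""
    grand_mean = sum(values) / len(values)
    total = sum((v - grand_mean) ** 2 for v in values)
    groups: Dict[str, List[float]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in groups.values()
    )
    return between, total


def between_ratio(parts: Dict[str, Tuple[float, float]], genes: Sequence[str]) -> float:
    """Share of between-group inertia for a gene subset (assumed additive over genes)."""
    between = sum(parts[g][0] for g in genes)
    total = sum(parts[g][1] for g in genes)
    return between / total if total > 0 else 0.0


def backward_select(
    expr: Dict[str, Sequence[float]], labels: Sequence[str], n_keep: int
) -> Tuple[List[str], List[float]]:
    """Drop one gene per round until n_keep (hypothetical stopping rule) genes remain."""
    parts = {gene: inertia_parts(vals, labels) for gene, vals in expr.items()}
    kept = list(expr)
    history = [between_ratio(parts, kept)]
    while len(kept) > n_keep:
        # Jackknife step: leave each gene out in turn and score the remaining set;
        # the gene whose omission gives the highest ratio is the least influential.
        drop = max(kept, key=lambda g: between_ratio(parts, [k for k in kept if k != g]))
        kept.remove(drop)
        history.append(between_ratio(parts, kept))
    return kept, history


# Toy data (invented): four samples in two groups, two informative genes, two noisy ones.
labels = ["A", "A", "B", "B"]
expr = {
    "g1": [1.0, 1.2, 3.0, 3.1],
    "g2": [2.0, 2.1, 0.1, 0.0],
    "g3": [1.0, 3.0, 1.1, 2.9],
    "g4": [0.5, 2.5, 2.0, 1.0],
}
kept, history = backward_select(expr, labels, n_keep=2)
print(kept)
```

On this toy input the two noisy genes are removed first, and the between-group share recorded in `history` rises at each step. A real analysis would also need an independent test set or leave-one-out cross-validation, as the abstract notes, since the ratio alone can be inflated by overfitting.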