
A new regularized least squares support vector regression for gene selection

BACKGROUND: Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects. In addition to the curse of dimensionality, many gene selection methods weight the contribution from each individual subject equally. This...


Bibliographic Details
Main authors: Chen, Pei-Chun; Huang, Su-Yun; Chen, Wei J; Hsiao, Chuhsing K
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2669483/
https://www.ncbi.nlm.nih.gov/pubmed/19187562
http://dx.doi.org/10.1186/1471-2105-10-44
author Chen, Pei-Chun
Huang, Su-Yun
Chen, Wei J
Hsiao, Chuhsing K
collection PubMed
description BACKGROUND: Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects. In addition to the curse of dimensionality, many gene selection methods weight the contribution from each individual subject equally. This equal-contribution assumption cannot account for the possible dependence among subjects who associate similarly to the disease, and may restrict the selection of influential genes. RESULTS: A novel approach to gene selection is proposed based on kernel similarities and kernel weights. We do not assume uniformity for subject contribution. Weights are calculated via regularized least squares support vector regression (RLS-SVR) of class levels on kernel similarities and are used to weight subject contribution. The cumulative sums of weighted expression levels are then ranked to select responsible genes. These procedures also work for multiclass classification. We demonstrate this algorithm on acute leukemia, colon cancer, small, round blue cell tumors of childhood, breast cancer, and lung cancer studies, using kernel Fisher discriminant analysis and support vector machines as classifiers. Other procedures are compared as well. CONCLUSION: This approach is easy to implement and fast in computation for both binary and multiclass problems. The gene set provided by the RLS-SVR weight-based approach contains fewer genes and achieves a higher accuracy than other procedures.
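The procedure outlined in the abstract (kernel similarities between subjects, subject weights from a regularized least squares fit of class labels on those similarities, then ranking genes by their weighted expression sums) can be illustrated with a small sketch. This is not the authors' implementation. The Gaussian kernel, the omission of an intercept term, the use of the absolute value of each weighted sum, the parameter values `gamma` and `lam`, the toy data, and every function name below are assumptions made for illustration only; the abstract does not specify them.

```python
import math

def gaussian_kernel(X, gamma):
    """Pairwise Gaussian similarities between subjects (rows of X). Kernel choice is an assumption."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def solve_linear(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def subject_weights(X, y, gamma=0.1, lam=1.0):
    """Regularized least squares fit of labels on kernel similarities: (K + lam*I) w = y.
    Simplified form without an intercept (an assumption)."""
    n = len(X)
    K = gaussian_kernel(X, gamma)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve_linear(A, [float(v) for v in y])

def rank_genes(X, w):
    """Score gene j by |sum_i w_i * X[i][j]| and return gene indices, highest score first."""
    p = len(X[0])
    scores = [abs(sum(w[i] * X[i][j] for i in range(len(X)))) for j in range(p)]
    ranking = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranking, scores

# Hypothetical toy data: 4 subjects x 3 genes; only gene 0 differs between the two classes.
X_toy = [
    [5.0, 1.0, 0.2],
    [4.8, 0.9, 0.1],
    [0.1, 1.0, 0.2],
    [0.2, 0.9, 0.1],
]
y_toy = [1, 1, -1, -1]

w_toy = subject_weights(X_toy, y_toy)
ranking_toy, scores_toy = rank_genes(X_toy, w_toy)
print("subject weights:", w_toy)
print("gene ranking:", ranking_toy)
```

In this toy setting the subjects of the two classes receive weights of opposite sign, so a gene whose expression is similar across classes largely cancels in the weighted sum, while a gene that separates the classes accumulates a large score.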
format Text
id pubmed-2669483
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2669483 2009-04-16 BMC Bioinformatics Methodology Article BioMed Central 2009-02-03 http://dx.doi.org/10.1186/1471-2105-10-44 Text en Copyright © 2009 Chen et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A new regularized least squares support vector regression for gene selection
topic Methodology Article