Informative Gene Selection and Direct Classification of Tumor Based on Chi-Square Test of Pairwise Gene Interactions

In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-sq...


Bibliographic Details
Main authors: Zhang, Hongyan; Li, Lanzhi; Luo, Chao; Sun, Congwei; Chen, Yuan; Dai, Zhijun; Yuan, Zheming
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4130026/
https://www.ncbi.nlm.nih.gov/pubmed/25140319
http://dx.doi.org/10.1155/2014/589290
author Zhang, Hongyan
Li, Lanzhi
Luo, Chao
Sun, Congwei
Chen, Yuan
Dai, Zhijun
Yuan, Zheming
collection PubMed
description In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-square test-based integrated rank gene and direct classifier (χ²-IRG-DC). First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based Direct Classifier (χ²-DC) within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ²-DC. Furthermore, we analyzed the robustness of χ²-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ²-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ²-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature selection methods also had good performance in χ²-DC.
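The first step of the abstract (ranking genes by chi-square tests of single genes and of pairwise gene interactions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how expression values are discretized or how single and pairwise statistics are weighted, so the median split in `binarize`, the `pair_weight` parameter, the averaging over partner genes, and all function names are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations


def chi_square(states, labels):
    """Pearson chi-square statistic of the states-by-labels contingency table."""
    n = len(labels)
    observed = Counter(zip(states, labels))
    row_totals = Counter(states)
    col_totals = Counter(labels)
    stat = 0.0
    for state, row in row_totals.items():
        for label, col in col_totals.items():
            expected = row * col / n
            stat += (observed[(state, label)] - expected) ** 2 / expected
    return stat


def binarize(values):
    """Assumed discretization: 1 if the value is above the gene's median, else 0."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return [int(v > median) for v in values]


def integrated_scores(expression, labels, pair_weight=0.5):
    """Combine single-gene and pairwise chi-square statistics into one score.

    expression: dict mapping gene name -> list of expression values per sample.
    pair_weight: hypothetical weight for the pairwise term (not from the paper).
    """
    states = {gene: binarize(values) for gene, values in expression.items()}
    # Single-gene term: association between one gene's state and the class.
    scores = {gene: chi_square(s, labels) for gene, s in states.items()}
    partners = max(len(states) - 1, 1)
    # Pairwise term: association between the joint state of two genes and the class,
    # averaged over the number of partner genes so it stays comparable in scale.
    for a, b in combinations(states, 2):
        joint = list(zip(states[a], states[b]))
        pair_stat = chi_square(joint, labels)
        scores[a] += pair_weight * pair_stat / partners
        scores[b] += pair_weight * pair_stat / partners
    return scores


def rank_genes(expression, labels, pair_weight=0.5):
    """Return gene names sorted from highest to lowest integrated score."""
    scores = integrated_scores(expression, labels, pair_weight)
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_genes` on a dictionary of per-gene expression lists and a matching list of class labels returns the genes in descending order of the combined score. The later steps in the abstract (sequential introduction of ranked genes with leave-one-out cross-validation, and the χ²-DC classifier itself) are not sketched here because the record gives too little detail to reconstruct them.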
format Online
Article
Text
id pubmed-4130026
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4130026 2014-08-19 Informative Gene Selection and Direct Classification of Tumor Based on Chi-Square Test of Pairwise Gene Interactions Zhang, Hongyan Li, Lanzhi Luo, Chao Sun, Congwei Chen, Yuan Dai, Zhijun Yuan, Zheming Biomed Res Int Research Article Hindawi Publishing Corporation 2014 2014-07-23 /pmc/articles/PMC4130026/ /pubmed/25140319 http://dx.doi.org/10.1155/2014/589290 Text en Copyright © 2014 Hongyan Zhang et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Informative Gene Selection and Direct Classification of Tumor Based on Chi-Square Test of Pairwise Gene Interactions
topic Research Article