
Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

BACKGROUND: Reliable and precise classification is essential for the successful diagnosis and treatment of cancer. Gene expression microarrays provide a high-throughput platform for discovering genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can...


Bibliographic Details
Main authors: Guan, Peng, Huang, Desheng, He, Miao, Zhou, Baosen
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2719616/
https://www.ncbi.nlm.nih.gov/pubmed/19615083
http://dx.doi.org/10.1186/1756-9966-28-103
author Guan, Peng
Huang, Desheng
He, Miao
Zhou, Baosen
collection PubMed
description BACKGROUND: Reliable and precise classification is essential for the successful diagnosis and treatment of cancer. Gene expression microarrays provide a high-throughput platform for discovering genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only remove or suppress noise in gene chips but also avoid the one-sided results of individual experiments. However, only a few studies have recognized the importance of prior information in cancer classification. METHODS: Using a support vector machine as the discriminant approach, we proposed a modified method that incorporates prior knowledge into cancer classification based on gene expression data in order to improve accuracy. A well-known public dataset, the malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using genes known to be related to lung adenocarcinoma. The procedures were performed in R 2.80. RESULTS: The modified method performed better after incorporating prior knowledge. Its accuracy improved from 98.86% to 100% in the training set and from 98.51% to 99.06% in the test set. Its standard deviation decreased from 0.26% to 0 in the training set and from 3.04% to 2.10% in the test set. CONCLUSION: Incorporating prior knowledge into discriminant analysis can effectively improve classification capacity and reduce the impact of noise. This idea may hold promise not only in practice but also in methodology.
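The abstract does not say how the prior knowledge is fed to the classifier. One plausible reading, sketched below purely as an illustration, is to force the known disease-related genes into the feature set and fill the remaining slots with the genes that best separate the two classes in the data. The sketch is in Python rather than the R used in the study. The function names `welch_t` and `select_genes`, the gene labels `GENE_A` to `GENE_X`, and the expression values are all hypothetical, and the Welch t-statistic is an assumed ranking criterion, not one confirmed by the record. The support vector machine training step is deliberately left out.

```python
from statistics import mean, stdev


def welch_t(a, b):
    """Absolute Welch t-statistic for two groups (each needs >= 2 values)."""
    denom = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0


def select_genes(expr, labels, prior_genes, k):
    """Return prior-knowledge genes found in the data, then the k best other genes.

    expr maps gene name -> list of expression values, one per sample.
    labels holds the class of each sample (exactly two classes expected).
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("exactly two classes are required")

    # Score every gene by how well it separates the two classes.
    scores = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == classes[0]]
        b = [v for v, y in zip(values, labels) if y == classes[1]]
        scores[gene] = welch_t(a, b)

    # Prior genes are kept regardless of their score (assumed strategy).
    prior = [g for g in prior_genes if g in expr]
    ranked = sorted(scores, key=scores.get, reverse=True)
    data_driven = [g for g in ranked if g not in prior][:k]
    return prior + data_driven


# Toy data, invented for illustration only.
labels = ["MPM"] * 3 + ["ADCA"] * 3
expr = {
    "GENE_A": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # strongly differential
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # flat
    "GENE_C": [3.0, 3.5, 2.5, 3.2, 2.9, 3.1],  # weak signal, but on the prior list
    "GENE_D": [0.5, 0.7, 0.6, 1.6, 1.5, 1.7],  # differential
}
prior_genes = ["GENE_C", "GENE_X"]  # GENE_X is absent from the data and is skipped

selected = select_genes(expr, labels, prior_genes, k=2)
print(selected)
```

The selected genes would then form the feature matrix passed to a support vector machine. Comparing accuracy with and without the `prior_genes` list mirrors the before-and-after comparison reported in the RESULTS.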
format Text
id pubmed-2719616
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2719616 2009-08-01 Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method Guan, Peng Huang, Desheng He, Miao Zhou, Baosen J Exp Clin Cancer Res Research
BioMed Central 2009-07-18 /pmc/articles/PMC2719616/ /pubmed/19615083 http://dx.doi.org/10.1186/1756-9966-28-103 Text en Copyright © 2009 Guan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2719616/
https://www.ncbi.nlm.nih.gov/pubmed/19615083
http://dx.doi.org/10.1186/1756-9966-28-103