
A filter-based feature selection approach for identifying potential biomarkers for lung cancer

BACKGROUND: Lung cancer is the leading cause of death from cancer in the world and its treatment is dependent on the type and stage of cancer detected in the patient. Molecular biomarkers that can characterize the cancer phenotype are thus a key tool in planning a therapeutic response. A common prot...


Bibliographic Details
Main authors: Lee, In-Hee, Lushington, Gerald H, Visvanathan, Mahesh
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3164604/
https://www.ncbi.nlm.nih.gov/pubmed/21884628
http://dx.doi.org/10.1186/2043-9113-1-11
author Lee, In-Hee
Lushington, Gerald H
Visvanathan, Mahesh
author_sort Lee, In-Hee
collection PubMed
description BACKGROUND: Lung cancer is the leading cause of cancer death worldwide, and its treatment depends on the type and stage of cancer detected in the patient. Molecular biomarkers that can characterize the cancer phenotype are thus a key tool in planning a therapeutic response. A common protocol for identifying such biomarkers is to use genomic microarray analysis to find genes that show differential expression according to disease state or type. Data-mining techniques such as feature selection are often used to isolate, from among a large set of differentially expressed genes, those specific genes whose expression patterns are most valuable for phenotypic differentiation. One such technique, Biomarker Identifier (BMI), was developed to identify features that can distinguish between two data groups of interest, which makes it highly applicable to such studies. RESULTS: Microarray data with validated genes were used to evaluate the utility of BMI in identifying markers for lung cancer. The data set contains 129 gene expression profiles from large-airway epithelial cells (60 samples from smokers with lung cancer and 69 from smokers without lung cancer), and 7 genes from this data set have been confirmed to be differentially expressed by quantitative PCR. Using this data set, BMI was compared with various well-known feature selection methods and was found to be more successful than the other methods in finding useful genes to classify cancerous samples. In addition, genes selected by BMI (given the same number of genes and classification algorithms) showed better discriminative power than those from the original study. Pathway analysis of the genes selected by BMI allowed us to correlate them with well-known cancer-related pathways. CONCLUSIONS: Our results show that BMI can be used to analyze microarray data and to find useful genes for classifying samples. Pathway analysis suggests that BMI is successful in identifying biomarker-quality cancer-related genes from the data.
format Online
Article
Text
id pubmed-3164604
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3164604 2011-09-02 A filter-based feature selection approach for identifying potential biomarkers for lung cancer. Lee, In-Hee; Lushington, Gerald H; Visvanathan, Mahesh. J Clin Bioinforma. Research. BioMed Central 2011-03-21 /pmc/articles/PMC3164604/ /pubmed/21884628 http://dx.doi.org/10.1186/2043-9113-1-11 Text en Copyright ©2011 Lee et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A filter-based feature selection approach for identifying potential biomarkers for lung cancer
topic Research