
Gene masking - a technique to improve accuracy for cancer classification with high dimensionality in microarray data



Bibliographic Details
Main Authors: Saini, Harsh, Lal, Sunil Pranit, Naidu, Vimal Vikash, Pickering, Vincel Wince, Singh, Gurmeet, Tsunoda, Tatsuhiko, Sharma, Alok
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260793/
https://www.ncbi.nlm.nih.gov/pubmed/28117659
http://dx.doi.org/10.1186/s12920-016-0233-2
author Saini, Harsh
Lal, Sunil Pranit
Naidu, Vimal Vikash
Pickering, Vincel Wince
Singh, Gurmeet
Tsunoda, Tatsuhiko
Sharma, Alok
collection PubMed
description BACKGROUND: A high-dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. METHODS: Gene masking is implemented via a binary-encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to identify the features that contribute most to the classification, thereby allowing researchers to isolate features that may have special significance. RESULTS: This technique was applied to publicly available datasets, where it substantially reduced the number of features used for classification while maintaining high accuracies. CONCLUSION: The proposed technique can be extremely useful in feature selection, as it heuristically removes non-contributing features to improve the performance of classifiers.
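The METHODS summary describes a binary-encoded genetic algorithm that masks features during classifier training. The following is a minimal sketch of that general idea, not the authors' implementation: the abstract does not specify the classifier, fitness function, or genetic operators, so the nearest-centroid classifier, the size penalty, tournament selection, single-point crossover, bit-flip mutation, the synthetic data, and every function and parameter name here are assumptions made for illustration. A real experiment would also score the final mask on a separate held-out test set.

```python
import random

def make_toy_data(n_samples, n_features, rng):
    """Hypothetical two-class data: only features 0 and 1 carry class signal."""
    data = []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += 3.0 * label
        x[1] -= 3.0 * label
        data.append((x, label))
    return data

def accuracy(mask, train, valid):
    """Validation accuracy of a nearest-centroid classifier on unmasked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an all-zero mask leaves nothing to classify with
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(idx))
        for k, i in enumerate(idx):
            acc[k] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def sq_dist(x, c):
        return sum((x[i] - c[k]) ** 2 for k, i in enumerate(idx))

    hits = sum(min(centroids, key=lambda y: sq_dist(x, centroids[y])) == label
               for x, label in valid)
    return hits / len(valid)

def fitness(mask, train, valid, penalty=0.002):
    """Accuracy minus a small (assumed) cost per retained feature."""
    return accuracy(mask, train, valid) - penalty * sum(mask)

def gene_mask_ga(train, valid, n_features, pop_size=20, generations=20,
                 p_mut=0.05, seed=0):
    """Evolve a 0/1 mask over features; 1 keeps a feature, 0 masks it out."""
    rng = random.Random(seed)

    def score(mask):
        return fitness(mask, train, valid)

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:], pop[1][:]]  # elitism: carry the two best masks forward
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_features)       # single-point crossover
            child = p1[:cut] + p2[cut:]
            nxt.append([b ^ (rng.random() < p_mut) for b in child])  # bit-flip mutation
        pop = nxt
    return max(pop, key=score)

N_FEATURES = 20
data_rng = random.Random(0)
train = make_toy_data(60, N_FEATURES, data_rng)
valid = make_toy_data(40, N_FEATURES, data_rng)
best = gene_mask_ga(train, valid, N_FEATURES, seed=1)
print("kept features:", [i for i, bit in enumerate(best) if bit])
print("validation accuracy:", accuracy(best, train, valid))
```

The retained indices in the returned mask are the features the search found useful, which mirrors the abstract's point that the mask can double as a way to flag potentially significant genes.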
format Online
Article
Text
id pubmed-5260793
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5260793 2017-01-30
BMC Med Genomics
Research
BioMed Central 2016-12-05
/pmc/articles/PMC5260793/
/pubmed/28117659
http://dx.doi.org/10.1186/s12920-016-0233-2
Text en
© The Author(s) 2016. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Gene masking - a technique to improve accuracy for cancer classification with high dimensionality in microarray data
topic Research