
Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio

Feature selection in high dimensional gene expression datasets not only reduces the dimension of the data, but also the execution time and computational cost of the underlying classifier. The current study introduces a novel feature selection method called weighted signal to noise ratio (W(SNR)) by...


Bibliographic Details
Main Authors: Hamraz, Muhammad; Ali, Amjad; Mashwani, Wali Khan; Aldahmani, Saeed; Khan, Zardad
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10128961/
https://www.ncbi.nlm.nih.gov/pubmed/37098036
http://dx.doi.org/10.1371/journal.pone.0284619
author Hamraz, Muhammad
Ali, Amjad
Mashwani, Wali Khan
Aldahmani, Saeed
Khan, Zardad
collection PubMed
description Feature selection in high dimensional gene expression datasets reduces not only the dimension of the data but also the execution time and computational cost of the underlying classifier. The current study introduces a novel feature selection method called weighted signal to noise ratio (W(SNR)), which exploits feature weights based on support vectors and the signal to noise ratio, with the objective of identifying the most informative genes in high dimensional classification problems. The combination of two state-of-the-art procedures enables the extraction of the most informative genes. The corresponding weights of these procedures are then multiplied and arranged in decreasing order. A larger weight for a feature indicates greater discriminatory power in classifying the tissue samples to their true classes. The method is validated on eight gene expression datasets. Moreover, the results of the proposed method (W(SNR)) are compared with four well known feature selection methods. W(SNR) outperforms the competing methods on 6 out of 8 datasets. Box plots and bar plots of the results of the proposed method and all the other methods are also constructed. The proposed method is further assessed on simulated data. Simulation analysis reveals that W(SNR) outperforms all the other methods included in the study.
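The ranking step described in the abstract (multiply two per-feature weights, then sort in decreasing order) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the SNR form |mean difference| / (sum of class standard deviations) is one common two-class definition, the function names `snr_scores` and `weighted_snr_ranking` are hypothetical, and the support-vector-based weights are passed in as a precomputed vector (for instance, absolute coefficients of a fitted linear SVM) because the record does not specify how they are derived. The demo uses a simple stand-in for those weights.

```python
import numpy as np

def snr_scores(X, y):
    """Per-feature signal to noise ratio for two classes (assumed form):
    |mean_a - mean_b| / (sd_a + sd_b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = X[y == classes[0]], X[y == classes[1]]
    eps = 1e-12  # guards against division by zero for constant features
    spread = a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1) + eps
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / spread

def weighted_snr_ranking(X, y, sv_weights):
    """Multiply the supplied feature weights by the SNR scores and
    return feature indices sorted by decreasing combined weight."""
    w = np.abs(np.asarray(sv_weights, dtype=float))
    combined = w * snr_scores(X, y)
    order = np.argsort(-combined, kind="stable")
    return order, combined

# Simulated data: 40 samples, 50 genes, the first 3 genes are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 50))
y = np.repeat([0, 1], 20)
X[y == 1, :3] += 3.0

# Stand-in for support-vector-based weights (an assumption for the demo):
# the absolute difference of class means.
placeholder_weights = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))

order, combined = weighted_snr_ranking(X, y, placeholder_weights)
top = order[:3]
print("top-ranked gene indices:", top)
```

Keeping the two weight sources separate makes it easy to swap in weights from an actual fitted classifier without touching the ranking logic.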
format Online
Article
Text
id pubmed-10128961
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10128961 2023-04-26 Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio Hamraz, Muhammad; Ali, Amjad; Mashwani, Wali Khan; Aldahmani, Saeed; Khan, Zardad PLoS One Research Article
Public Library of Science 2023-04-25 /pmc/articles/PMC10128961/ /pubmed/37098036 http://dx.doi.org/10.1371/journal.pone.0284619 Text en © 2023 Hamraz et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio
topic Research Article