
Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data

BACKGROUND: Gene shaving (GS) is an essential and challenging tool for biomedical researchers due to the large number of genes in the human genome and the complex nature of biological networks. Most GS methods are not applicable to non-linear and multi-view data sets. While the kernel-based methods can...


Bibliographic Details
Main authors: Alam, Md. Ashad, Shahjaman, Mohammd, Rahman, Md. Ferdush, Hossain, Fokhrul, Deng, Hong-Wen
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6532884/
https://www.ncbi.nlm.nih.gov/pubmed/31120939
http://dx.doi.org/10.1371/journal.pone.0217027
_version_ 1783421082511867904
author Alam, Md. Ashad
Shahjaman, Mohammd
Rahman, Md. Ferdush
Hossain, Fokhrul
Deng, Hong-Wen
author_facet Alam, Md. Ashad
Shahjaman, Mohammd
Rahman, Md. Ferdush
Hossain, Fokhrul
Deng, Hong-Wen
author_sort Alam, Md. Ashad
collection PubMed
description BACKGROUND: Gene shaving (GS) is an essential and challenging tool for biomedical researchers due to the large number of genes in the human genome and the complex nature of biological networks. Most GS methods are not applicable to non-linear and multi-view data sets. While kernel-based methods can overcome these problems, a well-founded positive definite kernel-based GS method has yet to be proposed for biomedical data analysis. METHODS AND FINDINGS: Since kernel-based methods applied to genomic information can improve the prediction of diseases, we propose a novel method, “kernel-based gene shaving,” which is based on the influence function of kernel canonical correlation analysis. To investigate the performance of the proposed method in comparison to state-of-the-art methods in gene shaving, we analyzed extensive simulated and real microarray gene expression data sets. Performance metrics including the true positive rate, true negative rate, false positive rate, false negative rate, misclassification error rate, false discovery rate, and area under the curve were computed for each method. In the colon cancer data analysis, the proposed method identified a significant subset of 210 genes out of 2000 genes and showed suggestively superior performance compared with other methods. The proposed method can be applied to the study of other disease processes in which the analysis of two-view data is a common task. CONCLUSIONS: We addressed the challenge of finding a unique kernel-based GS method by using the influence function of kernel canonical correlation analysis. The proposed method has been shown to perform better than state-of-the-art methods in gene shaving and has identified many more significant gene interactions, suggesting that genes act in concert in colon cancer. In similar biomedical data analyses, kernel-based methods could be applied to select a potential subset of genes. Positive definite kernel-based methods can overcome the non-linearity problem and improve the prediction process.
format Online
Article
Text
id pubmed-6532884
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-65328842019-06-05 Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data Alam, Md. Ashad Shahjaman, Mohammd Rahman, Md. Ferdush Hossain, Fokhrul Deng, Hong-Wen PLoS One Research Article BACKGROUND: Gene shaving (GS) is an essential and challenging tool for biomedical researchers due to the large number of genes in the human genome and the complex nature of biological networks. Most GS methods are not applicable to non-linear and multi-view data sets. While kernel-based methods can overcome these problems, a well-founded positive definite kernel-based GS method has yet to be proposed for biomedical data analysis. METHODS AND FINDINGS: Since kernel-based methods applied to genomic information can improve the prediction of diseases, we propose a novel method, “kernel-based gene shaving,” which is based on the influence function of kernel canonical correlation analysis. To investigate the performance of the proposed method in comparison to state-of-the-art methods in gene shaving, we analyzed extensive simulated and real microarray gene expression data sets. Performance metrics including the true positive rate, true negative rate, false positive rate, false negative rate, misclassification error rate, false discovery rate, and area under the curve were computed for each method. In the colon cancer data analysis, the proposed method identified a significant subset of 210 genes out of 2000 genes and showed suggestively superior performance compared with other methods. The proposed method can be applied to the study of other disease processes in which the analysis of two-view data is a common task. CONCLUSIONS: We addressed the challenge of finding a unique kernel-based GS method by using the influence function of kernel canonical correlation analysis. The proposed method has been shown to perform better than state-of-the-art methods in gene shaving and has identified many more significant gene interactions, suggesting that genes act in concert in colon cancer. In similar biomedical data analyses, kernel-based methods could be applied to select a potential subset of genes. Positive definite kernel-based methods can overcome the non-linearity problem and improve the prediction process. Public Library of Science 2019-05-23 /pmc/articles/PMC6532884/ /pubmed/31120939 http://dx.doi.org/10.1371/journal.pone.0217027 Text en © 2019 Alam et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Alam, Md. Ashad
Shahjaman, Mohammd
Rahman, Md. Ferdush
Hossain, Fokhrul
Deng, Hong-Wen
Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title_full Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title_fullStr Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title_full_unstemmed Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title_short Gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
title_sort gene shaving using a sensitivity analysis of kernel based machine learning approach, with applications to cancer data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6532884/
https://www.ncbi.nlm.nih.gov/pubmed/31120939
http://dx.doi.org/10.1371/journal.pone.0217027
work_keys_str_mv AT alammdashad geneshavingusingasensitivityanalysisofkernelbasedmachinelearningapproachwithapplicationstocancerdata
AT shahjamanmohammd geneshavingusingasensitivityanalysisofkernelbasedmachinelearningapproachwithapplicationstocancerdata
AT rahmanmdferdush geneshavingusingasensitivityanalysisofkernelbasedmachinelearningapproachwithapplicationstocancerdata
AT hossainfokhrul geneshavingusingasensitivityanalysisofkernelbasedmachinelearningapproachwithapplicationstocancerdata
AT denghongwen geneshavingusingasensitivityanalysisofkernelbasedmachinelearningapproachwithapplicationstocancerdata