Weighted K-means support vector machine for cancer prediction
To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM),...
Main author: | Kim, SungHwan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2016 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4960100/ https://www.ncbi.nlm.nih.gov/pubmed/27512621 http://dx.doi.org/10.1186/s40064-016-2677-4 |
_version_ | 1782444481126072320 |
---|---|
author | Kim, SungHwan |
author_facet | Kim, SungHwan |
author_sort | Kim, SungHwan |
collection | PubMed |
description | To date, the support vector machine (SVM) has been widely applied in diverse biomedical fields to address disease subtype identification and the pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and the weighted support vector machine (wSVM), in which the SVM imposes weights on the loss term. In addition, I demonstrate the numerical relations between the objective function of the SVM and the weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly apply the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). In terms of predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) as well as many popular classification rules. I applied the proposed methods to simulated data and to two large-scale real applications using the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases the accuracy of the classification model and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40064-016-2677-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4960100 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-4960100 2016-08-10 Weighted K-means support vector machine for cancer prediction Kim, SungHwan Springerplus Methodology Springer International Publishing 2016-07-25 /pmc/articles/PMC4960100/ /pubmed/27512621 http://dx.doi.org/10.1186/s40064-016-2677-4 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
spellingShingle | Methodology Kim, SungHwan Weighted K-means support vector machine for cancer prediction |
title | Weighted K-means support vector machine for cancer prediction |
title_full | Weighted K-means support vector machine for cancer prediction |
title_fullStr | Weighted K-means support vector machine for cancer prediction |
title_full_unstemmed | Weighted K-means support vector machine for cancer prediction |
title_short | Weighted K-means support vector machine for cancer prediction |
title_sort | weighted k-means support vector machine for cancer prediction |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4960100/ https://www.ncbi.nlm.nih.gov/pubmed/27512621 http://dx.doi.org/10.1186/s40064-016-2677-4 |
work_keys_str_mv | AT kimsunghwan weightedkmeanssupportvectormachineforcancerprediction |
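
The description above says that the weighted SVM imposes weights on the loss term. As a rough illustration of that idea only, the sketch below trains a linear SVM by batch subgradient descent on a per-sample weighted hinge loss. It is not the wSVM R package and not necessarily the article's exact formulation: the objective in the docstring is the textbook weighted soft-margin form, and the function names, the toy data, and the hyperparameters (`C`, `epochs`, `lr`) are all assumptions made for this example.

```python
def train_weighted_linear_svm(X, y, weights, C=1.0, epochs=500, lr=0.01):
    """Batch subgradient descent on an assumed weighted SVM objective:

        0.5 * ||w||^2 + C * sum_i weights[i] * max(0, 1 - y[i] * (w . x_i + b))

    X is a list of feature lists, y holds labels in {-1, +1}, and
    weights holds one non-negative weight per sample.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 regulariser
        grad_b = 0.0
        for xi, yi, omega in zip(X, y, weights):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:
                # The sample violates the margin, so its hinge term is
                # active; its contribution is scaled by its weight.
                for j in range(n_features):
                    grad_w[j] -= C * omega * yi * xi[j]
                grad_b -= C * omega * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: two linearly separable groups.
X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
uniform = [1.0] * len(X)

w, b = train_weighted_linear_svm(X, y, uniform)
print([predict(w, b, xi) for xi in X])
```

With uniform weights the objective reduces to the usual soft-margin SVM. A boosting wrapper of the kind the description mentions would repeatedly re-fit with updated `weights`, raising the weight of misclassified samples between rounds; that loop is left out here because the record does not specify the update rule.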