FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier

Here, we propose a heuristic technique of data trimming for SVM termed FLOating Window Projective Separator (FloWPS), tailored for personalized predictions based on molecular data. This procedure can operate with high throughput genetic datasets like gene expression or mutation profiles. Its applica...

Bibliographic Details
Main authors: Tkachev, Victor; Sorokin, Maxim; Mescheryakov, Artem; Simonov, Alexander; Garazha, Andrew; Buzdin, Anton; Muchnik, Ilya; Borisov, Nicolas
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6341065/
https://www.ncbi.nlm.nih.gov/pubmed/30697229
http://dx.doi.org/10.3389/fgene.2018.00717
collection PubMed
description Here, we propose a heuristic data-trimming technique for SVM, termed FLOating Window Projective Separator (FloWPS), tailored for personalized predictions based on molecular data. This procedure can operate on high-throughput genetic datasets such as gene expression or mutation profiles. Its application prevents the SVM from extrapolating by excluding non-informative features. FloWPS requires training on data from individuals with known clinical outcomes to create a clinically relevant classifier. The genetic profiles linked with the outcomes are split, as usual, into training and validation datasets. The distinctive property of FloWPS is that irrelevant features in the validation dataset, those that do not have a significant number of neighboring hits in the training dataset, are removed from further analysis. Next, similarly to the k nearest neighbors (kNN) method, for each point of the validation dataset FloWPS takes into account only the proximal points of the training dataset. Thus, for every point of the validation dataset, the training dataset is adjusted to form a floating window. FloWPS performance was tested on ten gene expression datasets covering 992 cancer patients who either responded or did not respond to different types of chemotherapy. Leave-one-out cross-validation confirmed experimentally that FloWPS significantly increases the quality of a classifier built on the classical SVM in most of the applications, particularly for polynomial kernels.
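The two trimming steps described above (dropping irrelevant features, then keeping only proximal training points) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define "significant number of neighboring hits", so the sketch assumes a feature is kept when at least `m` training values lie below and at least `m` lie above the validation point's value. It also assumes Euclidean distance for choosing the `k` proximal points. The names `relevant_features`, `floating_window`, `m` and `k` are hypothetical.

```python
from math import dist


def relevant_features(train_X, x, m):
    """Return indices of features kept for validation point x.

    Assumed criterion (not given in the abstract): a feature is kept
    when at least m training values lie strictly below x's value and
    at least m lie strictly above it, so the classifier would not
    have to extrapolate along that feature.
    """
    kept = []
    for j, value in enumerate(x):
        below = sum(1 for row in train_X if row[j] < value)
        above = sum(1 for row in train_X if row[j] > value)
        if below >= m and above >= m:
            kept.append(j)
    return kept


def floating_window(train_X, x, m, k):
    """Return (kept feature indices, indices of the k proximal training rows).

    Proximity is measured with Euclidean distance in the subspace of
    kept features (an assumption; the abstract only says "proximal").
    """
    features = relevant_features(train_X, x, m)

    def project(row):
        return [row[j] for j in features]

    px = project(x)
    order = sorted(range(len(train_X)),
                   key=lambda i: dist(project(train_X[i]), px))
    return features, sorted(order[:k])


# Toy data: 5 training samples, 2 features (values are illustrative only).
train_X = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0], [5.0, 50.0]]
x = [3.2, 60.0]  # the second feature lies outside the training range

features, rows = floating_window(train_X, x, m=1, k=3)
print(features, rows)
```

The trimmed training set (rows `rows`, columns `features`) would then be passed to an SVM fitted separately for each validation point, which is what makes the window "floating". The SVM fit itself is omitted here.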
format Online
Article
Text
id pubmed-6341065
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6341065 2019-01-29 Front Genet Genetics Frontiers Media S.A. 2019-01-15 /pmc/articles/PMC6341065/ /pubmed/30697229 http://dx.doi.org/10.3389/fgene.2018.00717 Text en Copyright © 2019 Tkachev, Sorokin, Mescheryakov, Simonov, Garazha, Buzdin, Muchnik and Borisov. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Genetics