Sparse Multicategory Generalized Distance Weighted Discrimination in Ultra-High Dimensions

Distance weighted discrimination (DWD) is an appealing classification method that is capable of overcoming data piling problems in high-dimensional settings. Especially when various sparsity structures are assumed in these settings, variable selection in multicategory classification poses great chal...

Bibliographic Details
Main authors: Su, Tong; Wang, Yafei; Liu, Yi; Branton, William G.; Asahchop, Eugene; Power, Christopher; Jiang, Bei; Kong, Linglong; Tang, Niansheng
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712546/
https://www.ncbi.nlm.nih.gov/pubmed/33287025
http://dx.doi.org/10.3390/e22111257
author Su, Tong
Wang, Yafei
Liu, Yi
Branton, William G.
Asahchop, Eugene
Power, Christopher
Jiang, Bei
Kong, Linglong
Tang, Niansheng
collection PubMed
description Distance weighted discrimination (DWD) is an appealing classification method that is capable of overcoming data piling problems in high-dimensional settings. Especially when various sparsity structures are assumed in these settings, variable selection in multicategory classification poses great challenges. In this paper, we propose a multicategory generalized DWD (MgDWD) method that maintains intrinsic variable group structures during selection using a sparse group lasso penalty. Theoretically, we derive minimizer uniqueness for the penalized MgDWD loss function and consistency properties for the proposed classifier. We further develop an efficient algorithm based on the proximal operator to solve the optimization problem. The performance of MgDWD is evaluated using finite sample simulations and miRNA data from an HIV study.
format Online
Article
Text
id pubmed-7712546
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7712546 2021-02-24 Sparse Multicategory Generalized Distance Weighted Discrimination in Ultra-High Dimensions Su, Tong Wang, Yafei Liu, Yi Branton, William G. Asahchop, Eugene Power, Christopher Jiang, Bei Kong, Linglong Tang, Niansheng Entropy (Basel) Article Distance weighted discrimination (DWD) is an appealing classification method that is capable of overcoming data piling problems in high-dimensional settings. Especially when various sparsity structures are assumed in these settings, variable selection in multicategory classification poses great challenges. In this paper, we propose a multicategory generalized DWD (MgDWD) method that maintains intrinsic variable group structures during selection using a sparse group lasso penalty. Theoretically, we derive minimizer uniqueness for the penalized MgDWD loss function and consistency properties for the proposed classifier. We further develop an efficient algorithm based on the proximal operator to solve the optimization problem. The performance of MgDWD is evaluated using finite sample simulations and miRNA data from an HIV study. MDPI 2020-11-05 /pmc/articles/PMC7712546/ /pubmed/33287025 http://dx.doi.org/10.3390/e22111257 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Sparse Multicategory Generalized Distance Weighted Discrimination in Ultra-High Dimensions
topic Article