Unobserved classes and extra variables in high-dimensional discriminant analysis
In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was...
Main authors: | Fop, Michael; Mattei, Pierre-Alexandre; Bouveyron, Charles; Murphy, Thomas Brendan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924148/ https://www.ncbi.nlm.nih.gov/pubmed/35308632 http://dx.doi.org/10.1007/s11634-021-00474-3 |
_version_ | 1784669785790873600 |
---|---|
author | Fop, Michael Mattei, Pierre-Alexandre Bouveyron, Charles Murphy, Thomas Brendan |
author_facet | Fop, Michael Mattei, Pierre-Alexandre Bouveyron, Charles Murphy, Thomas Brendan |
author_sort | Fop, Michael |
collection | PubMed |
description | In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a full inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations. |
format | Online Article Text |
id | pubmed-8924148 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-8924148 2022-03-17 Unobserved classes and extra variables in high-dimensional discriminant analysis Fop, Michael Mattei, Pierre-Alexandre Bouveyron, Charles Murphy, Thomas Brendan Adv Data Anal Classif Regular Article In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a full inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations. Springer Berlin Heidelberg 2022-03-01 2022 /pmc/articles/PMC8924148/ /pubmed/35308632 http://dx.doi.org/10.1007/s11634-021-00474-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Regular Article Fop, Michael Mattei, Pierre-Alexandre Bouveyron, Charles Murphy, Thomas Brendan Unobserved classes and extra variables in high-dimensional discriminant analysis |
title | Unobserved classes and extra variables in high-dimensional discriminant analysis |
title_full | Unobserved classes and extra variables in high-dimensional discriminant analysis |
title_fullStr | Unobserved classes and extra variables in high-dimensional discriminant analysis |
title_full_unstemmed | Unobserved classes and extra variables in high-dimensional discriminant analysis |
title_short | Unobserved classes and extra variables in high-dimensional discriminant analysis |
title_sort | unobserved classes and extra variables in high-dimensional discriminant analysis |
topic | Regular Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924148/ https://www.ncbi.nlm.nih.gov/pubmed/35308632 http://dx.doi.org/10.1007/s11634-021-00474-3 |
work_keys_str_mv | AT fopmichael unobservedclassesandextravariablesinhighdimensionaldiscriminantanalysis AT matteipierrealexandre unobservedclassesandextravariablesinhighdimensionaldiscriminantanalysis AT bouveyroncharles unobservedclassesandextravariablesinhighdimensionaldiscriminantanalysis AT murphythomasbrendan unobservedclassesandextravariablesinhighdimensionaldiscriminantanalysis |