Constraining classifiers in molecular analysis: invariance and robustness
Analysing molecular profiles requires the selection of classification models that can cope with the high dimensionality and variability of these data. Also, improper reference point choice and scaling pose additional challenges. Often model selection is somewhat guided by ad hoc simulations rather t...
Main authors: | Lausser, Ludwig; Szekely, Robin; Klimmek, Attila; Schmid, Florian; Kestler, Hans A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society 2020 |
Subjects: | Life Sciences–Mathematics interface |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061712/ https://www.ncbi.nlm.nih.gov/pubmed/32019472 http://dx.doi.org/10.1098/rsif.2019.0612 |
_version_ | 1783504441484247040 |
---|---|
author | Lausser, Ludwig Szekely, Robin Klimmek, Attila Schmid, Florian Kestler, Hans A. |
author_facet | Lausser, Ludwig Szekely, Robin Klimmek, Attila Schmid, Florian Kestler, Hans A. |
author_sort | Lausser, Ludwig |
collection | PubMed |
description | Analysing molecular profiles requires the selection of classification models that can cope with the high dimensionality and variability of these data. Also, improper reference point choice and scaling pose additional challenges. Often model selection is somewhat guided by ad hoc simulations rather than by sophisticated considerations on the properties of a categorization model. Here, we derive and report four linked linear concept classes/models with distinct invariance properties for high-dimensional molecular classification. We can further show that these concept classes also form a half-order of complexity classes in terms of Vapnik–Chervonenkis dimensions, which also implies increased generalization abilities. We implemented support vector machines with these properties. Surprisingly, we were able to attain comparable or even superior generalization abilities to the standard linear one on the 27 investigated RNA-Seq and microarray datasets. Our results indicate that a priori chosen invariant models can replace ad hoc robustness analysis by interpretable and theoretically guaranteed properties in molecular categorization. |
format | Online Article Text |
id | pubmed-7061712 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-70617122020-03-26 Constraining classifiers in molecular analysis: invariance and robustness Lausser, Ludwig Szekely, Robin Klimmek, Attila Schmid, Florian Kestler, Hans A. J R Soc Interface Life Sciences–Mathematics interface Analysing molecular profiles requires the selection of classification models that can cope with the high dimensionality and variability of these data. Also, improper reference point choice and scaling pose additional challenges. Often model selection is somewhat guided by ad hoc simulations rather than by sophisticated considerations on the properties of a categorization model. Here, we derive and report four linked linear concept classes/models with distinct invariance properties for high-dimensional molecular classification. We can further show that these concept classes also form a half-order of complexity classes in terms of Vapnik–Chervonenkis dimensions, which also implies increased generalization abilities. We implemented support vector machines with these properties. Surprisingly, we were able to attain comparable or even superior generalization abilities to the standard linear one on the 27 investigated RNA-Seq and microarray datasets. Our results indicate that a priori chosen invariant models can replace ad hoc robustness analysis by interpretable and theoretically guaranteed properties in molecular categorization. The Royal Society 2020-02 2020-02-05 /pmc/articles/PMC7061712/ /pubmed/32019472 http://dx.doi.org/10.1098/rsif.2019.0612 Text en © 2020 The Authors. http://creativecommons.org/licenses/by/4.0/ Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
spellingShingle | Life Sciences–Mathematics interface Lausser, Ludwig Szekely, Robin Klimmek, Attila Schmid, Florian Kestler, Hans A. Constraining classifiers in molecular analysis: invariance and robustness |
title | Constraining classifiers in molecular analysis: invariance and robustness |
topic | Life Sciences–Mathematics interface |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061712/ https://www.ncbi.nlm.nih.gov/pubmed/32019472 http://dx.doi.org/10.1098/rsif.2019.0612 |