
HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models

Surrogate models are frequently used to replace costly engineering simulations. A single surrogate is frequently chosen based on previous experience or by fitting multiple surrogates and selecting one based on mean cross-validation errors. A novel stacking strategy will be presented in this paper. T...


Bibliographic Details
Main authors:	Ozelim, Luan Carlos de Sena Monteiro; Ribeiro, Dimas Betioli; Schiavon, José Antonio; Domingues, Vinicius Resende; de Queiroz, Paulo Ivo Braga
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2023
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10470931/ https://www.ncbi.nlm.nih.gov/pubmed/37651433 http://dx.doi.org/10.1371/journal.pone.0290331

author	Ozelim, Luan Carlos de Sena Monteiro; Ribeiro, Dimas Betioli; Schiavon, José Antonio; Domingues, Vinicius Resende; de Queiroz, Paulo Ivo Braga
author_sort	Ozelim, Luan Carlos de Sena Monteiro
collection	PubMed
description	Surrogate models are frequently used to replace costly engineering simulations. A single surrogate is usually chosen based on previous experience or by fitting multiple surrogates and selecting one based on mean cross-validation errors. This paper presents a novel stacking strategy, which results from reinterpreting the model selection process in terms of the generalization error. For the first time, this problem is translated into a well-studied financial problem: portfolio management and optimization. In short, it is demonstrated that the individual residuals calculated by leave-one-out procedures are samples from a given random variable ϵ(i), whose second non-central moment is the i-th model’s generalization error. Thus, a stacking methodology based solely on evaluating the behavior of linear combinations of the random variables ϵ(i) is proposed. First, several surrogate models are calibrated. The Directed Bubble Hierarchical Tree (DBHT) clustering algorithm is then used to determine which models are worth stacking. The stacking weights can be calculated using any financial approach to the portfolio optimization problem. This alternative understanding of the problem enables practitioners to use established financial methodologies to calculate the models’ weights, significantly improving the out-of-sample performance of the ensemble. A case study is carried out to demonstrate the applicability of the new methodology. Overall, a total of 124 models were trained using a specific dataset: 40 Machine Learning models and 84 Polynomial Chaos Expansion models (which considered 3 types of base random variables and 7 least-squares algorithms for fitting the coefficients of expansions of up to fourth order). Among those, 99 models could be fitted without convergence or other numerical issues. The DBHT algorithm with Pearson correlation distance and generalization error similarity selected a subgroup of 23 models from the 99 fitted ones, a reduction of about 77% in the total number of models, which represents a good filtering scheme that still preserves diversity. Finally, it has been demonstrated that the weights obtained by building a Hierarchical Risk Parity (HRP) portfolio perform better for various input random variables, indicating better out-of-sample performance. In this way, an economic stacking strategy has demonstrated its worth in improving the out-of-sample capabilities of stacked models, which illustrates how the new understanding of model stacking methodologies may be useful.
format	Online Article Text
id	pubmed-10470931
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-10470931 2023-09-01 PLoS One Research Article Public Library of Science 2023-08-31 /pmc/articles/PMC10470931/ /pubmed/37651433 http://dx.doi.org/10.1371/journal.pone.0290331 Text en © 2023 Ozelim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10470931/ https://www.ncbi.nlm.nih.gov/pubmed/37651433 http://dx.doi.org/10.1371/journal.pone.0290331
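
The idea stated in the abstract, that leave-one-out residuals are samples of a random variable whose second non-central moment is the model's generalization error, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the residuals are synthetic, the function name min_second_moment_weights is hypothetical, and the weights come from the closed-form minimum-variance style portfolio (weights summing to one, negative weights allowed), not from the Hierarchical Risk Parity portfolio or the DBHT filtering reported in the paper. When the weights sum to one, the residual of the stacked model is the same weighted sum of the individual residuals, so its estimated generalization error is the quadratic form w' M w, where M is the empirical second-moment matrix of the residuals.

```python
import numpy as np


def min_second_moment_weights(residuals):
    """Return (weights, M) for an (n_samples, n_models) residual matrix.

    M is the empirical second non-central moment matrix of the residuals;
    its diagonal holds each model's estimated generalization error.
    The weights minimize w' M w subject to sum(w) == 1 (closed form).
    """
    n_samples = residuals.shape[0]
    M = residuals.T @ residuals / n_samples
    ones = np.ones(M.shape[0])
    x = np.linalg.solve(M, ones)  # x = M^{-1} 1
    return x / (ones @ x), M


# Synthetic leave-one-out residuals for three hypothetical models
# (illustrative data only, not taken from the paper).
rng = np.random.default_rng(0)
n_samples = 200
shared = rng.normal(size=n_samples)
residuals = np.column_stack([
    0.5 * shared + rng.normal(scale=1.0, size=n_samples),
    -0.3 * shared + rng.normal(scale=0.8, size=n_samples) + 0.1,
    rng.normal(scale=1.5, size=n_samples),
])

weights, M = min_second_moment_weights(residuals)
individual_errors = np.diag(M)           # per-model mean squared residual
ensemble_error = weights @ M @ weights   # mean squared residual of the stack

print("weights:", weights)
print("individual errors:", individual_errors)
print("ensemble error:", ensemble_error)
```

Because putting all the weight on a single model is itself a feasible choice, the in-sample ensemble error from this sketch can never exceed the smallest individual error. Whether that advantage carries over out of sample depends on how the weights are regularized, which is where approaches such as the HRP portfolio mentioned in the abstract come in.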
