An Exploratory Study on the Complexity and Machine Learning Predictability of Stock Market Data

Bibliographic Details
Main Authors: Raubitzek, Sebastian; Neubauer, Thomas
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8947671/
https://www.ncbi.nlm.nih.gov/pubmed/35327843
http://dx.doi.org/10.3390/e24030332
Description
Summary: This paper examines whether and how the predictability and complexity of stock market data have changed over the last half-century and what influence the M1 money supply has. We use three different machine learning algorithms, i.e., a stochastic gradient descent linear regression, a lasso regression, and an XGBoost tree regression, to test the predictability of two stock market indices, the Dow Jones Industrial Average and the NASDAQ (National Association of Securities Dealers Automated Quotations) Composite. In addition, all data under study are discussed in the context of a variety of measures of signal complexity. The results of this complexity analysis are then linked with the machine learning results to discover trends and correlations between predictability and complexity. Our results show a decrease in predictability and an increase in complexity for more recent years. We find a correlation between approximate entropy, sample entropy, and the predictability of the employed machine learning algorithms on the data under study. This link between the predictability of machine learning algorithms and the mentioned entropy measures has not been shown before. It should be considered when analyzing and predicting complex time series data such as stock market data, for example to identify regions of increased predictability.