
On the enrichment of time series with textual data for forecasting agricultural commodity prices


Bibliographic Details
Main authors: Reis Filho, Ivan José; Marcacini, Ricardo Marcondes; Rezende, Solange Oliveira
Format: Online Article Text
Language: English
Published: Elsevier 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9240644/
https://www.ncbi.nlm.nih.gov/pubmed/35782724
http://dx.doi.org/10.1016/j.mex.2022.101758
Description
Summary: Forecasting models in the financial market generally use quantitative time-series data. However, external factors such as weather events, economic crises, and the foreign exchange market can influence time-series data. This information is not explicit in the time series and can influence the prediction of the variable values. Textual data can be a source of knowledge about external factors and is potentially helpful for time-series forecasting models. Some studies have presented text mining techniques to combine textual and time-series data. However, the existing representations have limitations, such as the curse of dimensionality and sparse data. This work investigates the use of a finite set of domain-specific terms to address these problems by representing textual data in a low-dimensional space. We consider thirty-three keywords that are potentially important in the domain to enrich time series using text mining techniques. Four regression models were applied to the proposed representation to predict the future daily price of corn and soybeans. The experimental setup considers a real market scenario, in which a daily sliding-window strategy and step-forward forecasting were used. The proposed representation achieves better accuracy in some forecasting scenarios. The results indicate that text data are a promising alternative for enriching time-series representations and reducing uncertainty in forecasting models.
• We show an approach to enriching time series using domain-specific terms;
• The proposed representation combines quantitative data with qualitative market factors;
• Regression models learn a forecasting function from the enriched time series.
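The general idea in the summary (keyword-based enrichment of a price series, evaluated with a sliding window and one-step-ahead forecasts) can be illustrated with a minimal sketch. This is not the authors' implementation: the three keywords, the lag count, the window size, the ordinary least-squares model, and the synthetic data below are all assumptions chosen for illustration, and the article's thirty-three keywords and four regression models are not reproduced here.

```python
import numpy as np

# Hypothetical keyword list (assumption); the article's thirty-three terms are not listed in this record.
KEYWORDS = ["drought", "export", "harvest"]

def keyword_counts(texts, keywords=KEYWORDS):
    """Count whole-token occurrences of each keyword in one day's texts.
    Tokens are split on whitespace only, so attached punctuation prevents a match."""
    tokens = " ".join(texts).lower().split()
    return [tokens.count(k) for k in keywords]

def build_features(prices, daily_texts, n_lags=3):
    """Row for day t = [last n_lags prices up to t, keyword counts on day t];
    target = price on day t + 1."""
    X, y = [], []
    for t in range(n_lags - 1, len(prices) - 1):
        lags = prices[t - n_lags + 1 : t + 1]
        X.append(list(lags) + keyword_counts(daily_texts[t]))
        y.append(prices[t + 1])
    return np.array(X, dtype=float), np.array(y, dtype=float)

def sliding_window_forecast(X, y, window=10):
    """Fit least squares on the previous `window` rows, predict the next row,
    then slide the window forward by one day."""
    preds = []
    for end in range(window, len(X)):
        A = np.c_[np.ones(window), X[end - window : end]]  # add intercept column
        coef, *_ = np.linalg.lstsq(A, y[end - window : end], rcond=None)
        preds.append(float(np.r_[1.0, X[end]] @ coef))
    return np.array(preds)

# Synthetic demo data (assumption): 20 days of prices and one short text per day.
prices = [100 + 0.5 * i for i in range(20)]
daily_texts = [["drought warning"] if i % 4 == 0 else ["export news"] for i in range(20)]

X, y = build_features(prices, daily_texts)
preds = sliding_window_forecast(X, y)
print(X.shape, preds.shape)
```

Restricting the text representation to a fixed keyword list keeps the feature vector small (here three lags plus three counts), which is the low-dimensional property the summary contrasts with sparse, high-dimensional text representations.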