
On the enrichment of time series with textual data for forecasting agricultural commodity prices

Forecasting models in the financial market generally use quantitative time-series data. However, external factors can influence data in time-series, such as weather events, economic crises, and the foreign exchange market. This information is not explicit in the time-series and can influence the pre...


Bibliographic Details
Main Authors: Reis Filho, Ivan José; Marcacini, Ricardo Marcondes; Rezende, Solange Oliveira
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9240644/
https://www.ncbi.nlm.nih.gov/pubmed/35782724
http://dx.doi.org/10.1016/j.mex.2022.101758
author Reis Filho, Ivan José
Marcacini, Ricardo Marcondes
Rezende, Solange Oliveira
collection PubMed
description Forecasting models in the financial market generally use quantitative time-series data. However, external factors such as weather events, economic crises, and the foreign exchange market can influence time-series data. This information is not explicit in the time-series and can influence the prediction of the variable values. Textual data can be a source of knowledge about external factors and is potentially helpful for time-series forecasting models. Some studies have presented text mining techniques to combine textual and time-series data. However, the existing representations have limitations, such as the curse of dimensionality and sparse data. This work investigates the use of a finite set of domain-specific terms to address these problems by representing textual data in a low-dimensional space. We consider thirty-three keywords that are potentially important in the domain to enrich time-series using text mining techniques. Four regression models were applied to the proposed representation to predict the future daily price of corn and soybeans. The experimental setup considers a real market scenario, in which a daily sliding window strategy and step-forward forecasting were used. The proposed representation achieves better accuracy in some forecasting scenarios. The results indicate that text data are a promising alternative for enriching time-series representations and reducing the uncertainty of forecasting models.
• We show an approach to enriching time-series using domain-specific terms;
• The proposed representation combines quantitative data with qualitative market factors;
• Regression models learn a forecasting function from the enriched time-series.
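The pipeline outlined in the description (keyword-based enrichment, a daily sliding window, and step-forward forecasting with a regression model) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the three keywords are placeholders (the record does not list the thirty-three terms), ordinary least squares stands in for the four unnamed regression models, the tokenization is a plain whitespace split, and all function names are hypothetical.

```python
import numpy as np

# Placeholder keywords; the thirty-three domain terms are not given in this record.
KEYWORDS = ["drought", "export", "harvest"]

def keyword_counts(texts, keywords=KEYWORDS):
    """Turn one day's texts into a low-dimensional vector of keyword counts."""
    tokens = " ".join(texts).lower().split()  # naive whitespace tokenization
    return np.array([tokens.count(k) for k in keywords], dtype=float)

def build_samples(prices, daily_texts, window):
    """Sliding window: each sample holds the last `window` prices plus the
    keyword counts of the window's final day; the target is the next price."""
    X, y = [], []
    for t in range(window, len(prices)):
        past = prices[t - window:t]
        text_vec = keyword_counts(daily_texts[t - 1])
        X.append(np.concatenate([past, text_vec]))
        y.append(prices[t])
    return np.array(X), np.array(y)

def step_forward_forecasts(X, y, min_train):
    """Step-forward evaluation: refit on every sample before i, then predict i.
    Ordinary least squares is an assumed stand-in for the paper's regressors."""
    preds = []
    for i in range(min_train, len(y)):
        A = np.c_[X[:i], np.ones(i)]  # append an intercept column
        coef, *_ = np.linalg.lstsq(A, y[:i], rcond=None)
        preds.append(np.r_[X[i], 1.0] @ coef)
    return np.array(preds)
```

Each forecast uses only samples that precede the predicted day, which mirrors the "real market scenario" mentioned in the description; any other regressor could replace the least-squares fit without changing the surrounding loop.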
format Online
Article
Text
id pubmed-9240644
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9240644 2022-06-30 MethodsX Method Article
Elsevier 2022-06-17 /pmc/articles/PMC9240644/ /pubmed/35782724 http://dx.doi.org/10.1016/j.mex.2022.101758 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title On the enrichment of time series with textual data for forecasting agricultural commodity prices
topic Method Article