On the enrichment of time series with textual data for forecasting agricultural commodity prices
Forecasting models in the financial market generally use quantitative time-series data. However, external factors such as weather events, economic crises, and the foreign exchange market can influence time-series data. This information is not explicit in the time-series and can influence the pre...
Main authors: | Reis Filho, Ivan José; Marcacini, Ricardo Marcondes; Rezende, Solange Oliveira |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9240644/ https://www.ncbi.nlm.nih.gov/pubmed/35782724 http://dx.doi.org/10.1016/j.mex.2022.101758 |
_version_ | 1784737612559286272 |
---|---|
author | Reis Filho, Ivan José Marcacini, Ricardo Marcondes Rezende, Solange Oliveira |
author_facet | Reis Filho, Ivan José Marcacini, Ricardo Marcondes Rezende, Solange Oliveira |
author_sort | Reis Filho, Ivan José |
collection | PubMed |
description | Forecasting models in the financial market generally use quantitative time-series data. However, external factors such as weather events, economic crises, and the foreign exchange market can influence time-series data. This information is not explicit in the time-series and can influence the prediction of the variable values. Textual data can be a source of knowledge about external factors and is potentially helpful for time-series forecasting models. Some studies have presented text mining techniques to combine textual and time-series data. However, the existing representations have limitations, such as the curse of dimensionality and sparse data. This work investigates the use of a finite set of domain-specific terms to address these problems by representing textual data in a low-dimensional space. We consider thirty-three keywords that are potentially important in the domain to enrich time-series using text mining techniques. Four regression models were applied to the proposed representation to predict the future daily price of corn and soybeans. The experimental setup considers a real market scenario, in which a daily sliding window strategy and step-forward forecasts were used. The proposed representation has better accuracy in some forecasting scenarios. The results indicate that text data are a promising alternative for enriching time-series representations and reducing uncertainty in forecasting models. • We show an approach to enriching time-series using domain-specific terms; • The proposed representation combines quantitative data with qualitative market factors; • Regression models learn a forecasting function from the enriched time-series. |
format | Online Article Text |
id | pubmed-9240644 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9240644 2022-06-30 On the enrichment of time series with textual data for forecasting agricultural commodity prices Reis Filho, Ivan José Marcacini, Ricardo Marcondes Rezende, Solange Oliveira MethodsX Method Article
Elsevier 2022-06-17 /pmc/articles/PMC9240644/ /pubmed/35782724 http://dx.doi.org/10.1016/j.mex.2022.101758 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Method Article Reis Filho, Ivan José Marcacini, Ricardo Marcondes Rezende, Solange Oliveira On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title | On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title_full | On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title_fullStr | On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title_full_unstemmed | On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title_short | On the enrichment of time series with textual data for forecasting agricultural commodity prices |
title_sort | on the enrichment of time series with textual data for forecasting agricultural commodity prices |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9240644/ https://www.ncbi.nlm.nih.gov/pubmed/35782724 http://dx.doi.org/10.1016/j.mex.2022.101758 |
work_keys_str_mv | AT reisfilhoivanjose ontheenrichmentoftimeserieswithtextualdataforforecastingagriculturalcommodityprices AT marcaciniricardomarcondes ontheenrichmentoftimeserieswithtextualdataforforecastingagriculturalcommodityprices AT rezendesolangeoliveira ontheenrichmentoftimeserieswithtextualdataforforecastingagriculturalcommodityprices |
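
The description in this record outlines a pipeline: represent each day's text by counts of a small set of domain keywords, append those counts to a sliding window of past prices, and forecast the next day step by step. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The keyword list, the function names, and the toy data are all hypothetical (the record does not list the thirty-three keywords), and the k-nearest-neighbour regressor is a stand-in, since the record does not name the four regression models.

```python
import math
import re
from collections import Counter

# Hypothetical keywords; the article uses thirty-three domain terms not listed in this record.
KEYWORDS = ["drought", "export", "harvest"]


def keyword_counts(text, keywords=KEYWORDS):
    """Represent one day's text as a low-dimensional vector of keyword counts."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    return [tokens[k] for k in keywords]


def enrich(prices, daily_texts, window):
    """Build (features, target) rows: `window` past prices plus the keyword
    counts of the last day in the window, paired with the next-day price."""
    rows = []
    for t in range(window, len(prices)):
        features = list(prices[t - window:t]) + keyword_counts(daily_texts[t - 1])
        rows.append((features, prices[t]))
    return rows


def knn_predict(train, query, k=2):
    """Stand-in regressor: mean target of the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def step_forward_forecast(prices, daily_texts, window=2, min_train=3, k=2):
    """Forecast one day at a time, training only on rows that precede the
    forecast day, and return (forecast, actual) pairs."""
    rows = enrich(prices, daily_texts, window)
    results = []
    for i in range(min_train, len(rows)):
        features, actual = rows[i]
        results.append((knn_predict(rows[:i], features, k), actual))
    return results


if __name__ == "__main__":
    # Toy, made-up data for illustration only.
    prices = [10.0, 10.2, 10.1, 10.5, 10.9, 11.0, 10.8]
    texts = [
        "Harvest on schedule",
        "Export demand steady",
        "Drought fears grow",
        "Drought cuts harvest outlook",
        "Export ban discussed",
        "Rain eases drought",
        "Harvest resumes",
    ]
    for forecast, actual in step_forward_forecast(prices, texts):
        print(f"forecast={forecast:.2f} actual={actual:.2f}")
```

In this sketch prices and keyword counts share one unscaled feature vector, so the distance measure mixes units; a real experiment would probably normalise the features and compare against a prices-only baseline to see whether the textual enrichment helps.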