Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008

Corporate disclosure became more descriptive than quantitative over time. As a result, textual analysis gained popularity in finance and business; however, it requires massive computing power. The paper presents a panel set of the raw frequencies of positive and negative words across 90,463 Forms...

Bibliographic Details
Main Authors: Staszkiewicz, Piotr; Staszkiewicz, Richard
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011009/
https://www.ncbi.nlm.nih.gov/pubmed/35434221
http://dx.doi.org/10.1016/j.dib.2022.108110
_version_ 1784687599414149120
author Staszkiewicz, Piotr
Staszkiewicz, Richard
author_facet Staszkiewicz, Piotr
Staszkiewicz, Richard
author_sort Staszkiewicz, Piotr
collection PubMed
description Corporate disclosure became more descriptive than quantitative over time. As a result, textual analysis gained popularity in finance and business; however, it requires massive computing power. The paper presents a panel set of the raw frequencies of positive and negative words across 90,463 Forms 10-K filed with the Securities and Exchange Commission (SEC) in EDGAR (the Electronic Data Gathering, Analysis, and Retrieval system) over the period 1995–2008. The dataset consists of 456 variables. The texts of the forms were retrieved from the SEC servers and processed using text mining techniques. The data are relevant for archival analysis of the sentiment of financial statements and financial reporting by SEC registrants. They can potentially be reused to create tone or sentiment indexes. The long time series allows for dynamic analysis. The data set reduces the computing power required for further research.
format Online
Article
Text
id pubmed-9011009
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9011009 2022-04-16 Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008 Staszkiewicz, Piotr Staszkiewicz, Richard Data Brief Data Article Corporate disclosure became more descriptive than quantitative over time. As a result, textual analysis gained popularity in finance and business; however, it requires massive computing power. The paper presents a panel set of the raw frequencies of positive and negative words across 90,463 Forms 10-K filed with the Securities and Exchange Commission (SEC) in EDGAR (the Electronic Data Gathering, Analysis, and Retrieval system) over the period 1995–2008. The dataset consists of 456 variables. The texts of the forms were retrieved from the SEC servers and processed using text mining techniques. The data are relevant for archival analysis of the sentiment of financial statements and financial reporting by SEC registrants. They can potentially be reused to create tone or sentiment indexes. The long time series allows for dynamic analysis. The data set reduces the computing power required for further research. Elsevier 2022-03-29 /pmc/articles/PMC9011009/ /pubmed/35434221 http://dx.doi.org/10.1016/j.dib.2022.108110 Text en © 2022 The Author(s). Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Staszkiewicz, Piotr
Staszkiewicz, Richard
Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title_full Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title_fullStr Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title_full_unstemmed Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title_short Security exchange commission forms K-10 filings – Positive and negative word occurrence dataset 1995–2008
title_sort security exchange commission forms k-10 filings – positive and negative word occurrence dataset 1995–2008
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011009/
https://www.ncbi.nlm.nih.gov/pubmed/35434221
http://dx.doi.org/10.1016/j.dib.2022.108110
work_keys_str_mv AT staszkiewiczpiotr securityexchangecommissionformsk10filingspositiveandnegativewordoccurrencedataset19952008
AT staszkiewiczrichard securityexchangecommissionformsk10filingspositiveandnegativewordoccurrencedataset19952008