
Psycholinguistic dataset on language use in 1145 novels published in English and Dutch

This dataset includes psycholinguistic data on 694 English-language and 451 Dutch-language novels, acquired with computerised analysis of digitised novels published mainly between 1800 and 2018. The English-language novels have a total word count of 66.9 million words, while the Dutch-language novel...

Bibliographic Details
Main Authors:	Luoto, Severi; van Cranenburgh, Andreas
Format:	Online Article Text
Language:	English
Published:	Elsevier 2020
Subjects:	Data Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772540/ https://www.ncbi.nlm.nih.gov/pubmed/33385024 http://dx.doi.org/10.1016/j.dib.2020.106655

author	Luoto, Severi; van Cranenburgh, Andreas
collection	PubMed
description	This dataset includes psycholinguistic data on 694 English-language and 451 Dutch-language novels, acquired with computerised analysis of digitised novels published mainly between 1800 and 2018. The English-language novels have a total word count of 66.9 million words, while the Dutch-language novels comprise 49.6 million words, therefore offering large, representative samples for both languages. The data provided in this article include 93 linguistic and psycholinguistic outcome variables for the English-language novels, acquired using Linguistic Inquiry and Word Count (LIWC) version 2015, and 68 linguistic and psycholinguistic outcome variables for the Dutch-language novels, acquired using Linguistic Inquiry and Word Count (LIWC) version 2001. The dataset also includes word frequencies (unigram and bigram) for each novel. The metadata for each novel include year of publication, authors’ nationality, sex, age at publication, and sexual orientation (the latter only in the English-language dataset), making it possible for researchers to study the data along these parameters. The use of these data can help researchers illuminate how word use reflects psychological processes in more than two centuries of literary art in English and in contemporary Dutch novels.
format	Online Article Text
id	pubmed-7772540
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-7772540 2020-12-30 Luoto, Severi; van Cranenburgh, Andreas. Data Brief. Data Article. Elsevier 2020-12-16 /pmc/articles/PMC7772540/ /pubmed/33385024 http://dx.doi.org/10.1016/j.dib.2020.106655 Text en © 2020 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Psycholinguistic dataset on language use in 1145 novels published in English and Dutch
topic	Data Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772540/ https://www.ncbi.nlm.nih.gov/pubmed/33385024 http://dx.doi.org/10.1016/j.dib.2020.106655
