LADEC: The Large Database of English Compounds

The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g....

Bibliographic Details
Main Authors: Gagné, Christina L., Spalding, Thomas L., Schmidtke, Daniel
Format: Online Article Text
Language: English
Published: Springer US 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6797637/
https://www.ncbi.nlm.nih.gov/pubmed/31347038
http://dx.doi.org/10.3758/s13428-019-01282-6
_version_ 1783459873517731840
author Gagné, Christina L.
Spalding, Thomas L.
Schmidtke, Daniel
author_facet Gagné, Christina L.
Spalding, Thomas L.
Schmidtke, Daniel
author_sort Gagné, Christina L.
collection PubMed
description The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g., teacher) constituents were used. The items were selected from a range of sources, including CELEX, the English Lexicon Project, the British Lexicon Project, the British National Corpus, and Wordnet, and were hand-coded as compounds (e.g., snowball). Participants rated each compound in terms of how predictable its meaning is from its parts, as well as the extent to which each constituent retains its meaning in the compound. In addition, we obtained linguistic characteristics that might influence compound processing (e.g., frequency, family size, and bigram frequency). To show the usefulness of the database in investigating compound processing, we conducted a number of analyses that showed that compound processing is consistently affected by semantic transparency, as well as by many of the other variables included in LADEC. We also showed that the effects of the variables associated with the two constituents are not symmetric. In short, LADEC provides the opportunity for researchers to investigate a number of questions about compounds that have not been possible to investigate in the past, due to the lack of sufficiently large and robust datasets. In addition to directly allowing researchers to test hypotheses using the information included in LADEC, the database will contribute to future compound research by allowing better stimulus selection and matching.
format Online
Article
Text
id pubmed-6797637
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-6797637 2019-11-01 LADEC: The Large Database of English Compounds Gagné, Christina L. Spalding, Thomas L. Schmidtke, Daniel Behav Res Methods Article
Springer US 2019-07-25 2019 /pmc/articles/PMC6797637/ /pubmed/31347038 http://dx.doi.org/10.3758/s13428-019-01282-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle Article
Gagné, Christina L.
Spalding, Thomas L.
Schmidtke, Daniel
LADEC: The Large Database of English Compounds
title LADEC: The Large Database of English Compounds
title_full LADEC: The Large Database of English Compounds
title_fullStr LADEC: The Large Database of English Compounds
title_full_unstemmed LADEC: The Large Database of English Compounds
title_short LADEC: The Large Database of English Compounds
title_sort ladec: the large database of english compounds
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6797637/
https://www.ncbi.nlm.nih.gov/pubmed/31347038
http://dx.doi.org/10.3758/s13428-019-01282-6
work_keys_str_mv AT gagnechristinal ladecthelargedatabaseofenglishcompounds
AT spaldingthomasl ladecthelargedatabaseofenglishcompounds
AT schmidtkedaniel ladecthelargedatabaseofenglishcompounds