LADEC: The Large Database of English Compounds
The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g....
Main authors: | Gagné, Christina L.; Spalding, Thomas L.; Schmidtke, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6797637/ https://www.ncbi.nlm.nih.gov/pubmed/31347038 http://dx.doi.org/10.3758/s13428-019-01282-6 |
_version_ | 1783459873517731840 |
---|---|
author | Gagné, Christina L. Spalding, Thomas L. Schmidtke, Daniel |
author_facet | Gagné, Christina L. Spalding, Thomas L. Schmidtke, Daniel |
author_sort | Gagné, Christina L. |
collection | PubMed |
description | The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g., teacher) constituents were used. The items were selected from a range of sources, including CELEX, the English Lexicon Project, the British Lexicon Project, the British National Corpus, and Wordnet, and were hand-coded as compounds (e.g., snowball). Participants rated each compound in terms of how predictable its meaning is from its parts, as well as the extent to which each constituent retains its meaning in the compound. In addition, we obtained linguistic characteristics that might influence compound processing (e.g., frequency, family size, and bigram frequency). To show the usefulness of the database in investigating compound processing, we conducted a number of analyses that showed that compound processing is consistently affected by semantic transparency, as well as by many of the other variables included in LADEC. We also showed that the effects of the variables associated with the two constituents are not symmetric. In short, LADEC provides the opportunity for researchers to investigate a number of questions about compounds that have not been possible to investigate in the past, due to the lack of sufficiently large and robust datasets. In addition to directly allowing researchers to test hypotheses using the information included in LADEC, the database will contribute to future compound research by allowing better stimulus selection and matching. |
format | Online Article Text |
id | pubmed-6797637 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-6797637 2019-11-01 LADEC: The Large Database of English Compounds Gagné, Christina L. Spalding, Thomas L. Schmidtke, Daniel Behav Res Methods Article The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g., teacher) constituents were used. The items were selected from a range of sources, including CELEX, the English Lexicon Project, the British Lexicon Project, the British National Corpus, and Wordnet, and were hand-coded as compounds (e.g., snowball). Participants rated each compound in terms of how predictable its meaning is from its parts, as well as the extent to which each constituent retains its meaning in the compound. In addition, we obtained linguistic characteristics that might influence compound processing (e.g., frequency, family size, and bigram frequency). To show the usefulness of the database in investigating compound processing, we conducted a number of analyses that showed that compound processing is consistently affected by semantic transparency, as well as by many of the other variables included in LADEC. We also showed that the effects of the variables associated with the two constituents are not symmetric. In short, LADEC provides the opportunity for researchers to investigate a number of questions about compounds that have not been possible to investigate in the past, due to the lack of sufficiently large and robust datasets. In addition to directly allowing researchers to test hypotheses using the information included in LADEC, the database will contribute to future compound research by allowing better stimulus selection and matching.
Springer US 2019-07-25 2019 /pmc/articles/PMC6797637/ /pubmed/31347038 http://dx.doi.org/10.3758/s13428-019-01282-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
spellingShingle | Article Gagné, Christina L. Spalding, Thomas L. Schmidtke, Daniel LADEC: The Large Database of English Compounds |
title | LADEC: The Large Database of English Compounds |
title_full | LADEC: The Large Database of English Compounds |
title_fullStr | LADEC: The Large Database of English Compounds |
title_full_unstemmed | LADEC: The Large Database of English Compounds |
title_short | LADEC: The Large Database of English Compounds |
title_sort | ladec: the large database of english compounds |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6797637/ https://www.ncbi.nlm.nih.gov/pubmed/31347038 http://dx.doi.org/10.3758/s13428-019-01282-6 |
work_keys_str_mv | AT gagnechristinal ladecthelargedatabaseofenglishcompounds AT spaldingthomasl ladecthelargedatabaseofenglishcompounds AT schmidtkedaniel ladecthelargedatabaseofenglishcompounds |