Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations
BACKGROUND: Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few LSA measures calculated by clinicians in everyday practice. However, it has significant well-documented shortcomings; these include instability as a function of sample size, and absence of clear devel...
Main Authors: | Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Psychology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9257278/ https://www.ncbi.nlm.nih.gov/pubmed/35814069 http://dx.doi.org/10.3389/fpsyg.2022.905789 |
_version_ | 1784741310796660736 |
---|---|
author | Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan |
author_facet | Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan |
author_sort | Yang, Ji Seung |
collection | PubMed |
description | BACKGROUND: Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few LSA measures calculated by clinicians in everyday practice. However, it has significant, well-documented shortcomings; these include instability as a function of sample size and the absence of clear developmental profiles over early childhood. A variety of alternative measures of lexical diversity have been proposed; some, such as Number of Different Words/100 (NDW), can also be computed by hand. However, others, such as Vocabulary Diversity (VocD) and the Moving Average Type Token Ratio (MATTR), rely on complex resampling algorithms that cannot be conducted by hand. To date, no large-scale study of all four measures has evaluated how well any capture typical developmental trends over early childhood, or whether any reliably distinguish typical from atypical profiles of expressive child language ability. MATERIALS AND METHODS: We conducted linear and non-linear regression analyses for TTR, NDW, VocD, and MATTR scores for samples taken from 946 corpora from typically developing preschool children (ages 2–6 years), engaged in adult-child toy play, from the Child Language Data Exchange System (CHILDES). These were contrasted with 504 samples from children known to have delayed expressive language skills (total n = 1,454 samples). We also conducted a separate sub-analysis which examined possible contextual effects of sampling environment on lexical diversity. RESULTS: Only VocD showed significantly different mean scores between the typically developing and delayed-developing groups of children. Using TTR would actually misdiagnose typical children and miss children with known language impairment. However, the effect of toy interactions on VocD was significant and emerges as a further caution in the use of lexical diversity as a valid proxy index of children’s expressive vocabulary skill. DISCUSSION: This large-scale statistical comparison of computer-implemented algorithms for expressive lexical profiles in young children with traditional, hand-calculated measures showed that only VocD met criteria for evidence-based use in LSA. However, VocD was impacted by sample elicitation context, suggesting that non-linguistic factors, such as engagement with elicitation props, contaminate estimates of spoken lexical skill in young children. Implications and suggested directions are discussed. |
format | Online Article Text |
id | pubmed-9257278 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9257278 2022-07-07 Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan Front Psychol Psychology BACKGROUND: Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few LSA measures calculated by clinicians in everyday practice. However, it has significant, well-documented shortcomings; these include instability as a function of sample size and the absence of clear developmental profiles over early childhood. A variety of alternative measures of lexical diversity have been proposed; some, such as Number of Different Words/100 (NDW), can also be computed by hand. However, others, such as Vocabulary Diversity (VocD) and the Moving Average Type Token Ratio (MATTR), rely on complex resampling algorithms that cannot be conducted by hand. To date, no large-scale study of all four measures has evaluated how well any capture typical developmental trends over early childhood, or whether any reliably distinguish typical from atypical profiles of expressive child language ability. MATERIALS AND METHODS: We conducted linear and non-linear regression analyses for TTR, NDW, VocD, and MATTR scores for samples taken from 946 corpora from typically developing preschool children (ages 2–6 years), engaged in adult-child toy play, from the Child Language Data Exchange System (CHILDES). These were contrasted with 504 samples from children known to have delayed expressive language skills (total n = 1,454 samples). We also conducted a separate sub-analysis which examined possible contextual effects of sampling environment on lexical diversity. RESULTS: Only VocD showed significantly different mean scores between the typically developing and delayed-developing groups of children. Using TTR would actually misdiagnose typical children and miss children with known language impairment. However, the effect of toy interactions on VocD was significant and emerges as a further caution in the use of lexical diversity as a valid proxy index of children’s expressive vocabulary skill. DISCUSSION: This large-scale statistical comparison of computer-implemented algorithms for expressive lexical profiles in young children with traditional, hand-calculated measures showed that only VocD met criteria for evidence-based use in LSA. However, VocD was impacted by sample elicitation context, suggesting that non-linguistic factors, such as engagement with elicitation props, contaminate estimates of spoken lexical skill in young children. Implications and suggested directions are discussed. Frontiers Media S.A. 2022-06-22 /pmc/articles/PMC9257278/ /pubmed/35814069 http://dx.doi.org/10.3389/fpsyg.2022.905789 Text en Copyright © 2022 Yang, Rosvold and Bernstein Ratner. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychology Yang, Ji Seung Rosvold, Carly Bernstein Ratner, Nan Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations |
title | Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations |
title_sort | measurement of lexical diversity in children’s spoken language: computational and conceptual considerations |
topic | Psychology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9257278/ https://www.ncbi.nlm.nih.gov/pubmed/35814069 http://dx.doi.org/10.3389/fpsyg.2022.905789 |
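
The description above contrasts hand-computable measures (TTR, NDW) with a moving-window measure (MATTR) and notes that TTR is unstable as a function of sample size. The following is a minimal Python sketch of those three measures, not the article's or CHILDES/CLAN's implementation. The function names, the whitespace tokenizer, the reading of NDW as "distinct words in the first 100 tokens", and the default window size are all assumptions made for illustration. VocD is omitted because its curve-fitting procedure is not described in this record.

```python
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def ttr(tokens):
    """Type-Token Ratio: distinct words divided by total words."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty sample")
    return len(set(tokens)) / len(tokens)


def ndw(tokens, n=100):
    """Number of different words in the first n tokens (assumed reading of NDW/100)."""
    return len(set(tokens[:n]))


def mattr(tokens, window=50):
    """Moving-Average TTR: mean TTR over every window of `window` consecutive tokens.

    Falls back to plain TTR when the sample is shorter than one window
    (an assumption of this sketch).
    """
    if len(tokens) < window:
        return ttr(tokens)
    counts = Counter(tokens[:window])  # word counts inside the current window
    total = len(counts) / window       # TTR of the first window
    for i in range(window, len(tokens)):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]        # keep len(counts) equal to the type count
        counts[tokens[i]] += 1
        total += len(counts) / window
    return total / (len(tokens) - window + 1)


if __name__ == "__main__":
    sample = tokenize("the dog saw the cat")
    # Repeating the sample adds tokens but no new types, so TTR drops
    # even though the vocabulary is unchanged.
    print(ttr(sample), ttr(sample * 2))
    print(mattr(sample, window=5), mattr(sample * 2, window=5))
```

In this toy case, doubling the sample halves TTR while the fixed-window MATTR changes far less, which illustrates the sample-size sensitivity the description attributes to TTR.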