Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations

Bibliographic Details
Main Authors: Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Psychology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9257278/
https://www.ncbi.nlm.nih.gov/pubmed/35814069
http://dx.doi.org/10.3389/fpsyg.2022.905789
author Yang, Ji Seung
Rosvold, Carly
Bernstein Ratner, Nan
collection PubMed
description BACKGROUND: Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few language sample analysis (LSA) measures calculated by clinicians in everyday practice. However, it has significant, well-documented shortcomings; these include instability as a function of sample size and the absence of clear developmental profiles over early childhood. A variety of alternative measures of lexical diversity have been proposed; some, such as Number of Different Words/100 (NDW), can also be computed by hand. Others, such as Vocabulary Diversity (VocD) and the Moving Average Type Token Ratio (MATTR), rely on complex resampling algorithms that cannot be carried out by hand. To date, no large-scale study of all four measures has evaluated how well any of them capture typical developmental trends over early childhood, or whether any reliably distinguish typical from atypical profiles of expressive child language ability.
MATERIALS AND METHODS: We conducted linear and non-linear regression analyses for TTR, NDW, VocD, and MATTR scores for samples taken from 946 corpora from typically developing preschool children (ages 2–6 years), engaged in adult-child toy play, from the Child Language Data Exchange System (CHILDES). These were contrasted with 504 samples from children known to have delayed expressive language skills (total n = 1,454 samples). We also conducted a separate sub-analysis that examined possible contextual effects of the sampling environment on lexical diversity.
RESULTS: Only VocD showed significantly different mean scores between the typically developing children and the children with delayed expressive language. Using TTR would actually misdiagnose typical children and miss children with known language impairment. However, VocD also varied significantly as a function of toy interactions, which is a further caution against using lexical diversity as a valid proxy index of children’s expressive vocabulary skill.
DISCUSSION: This large-scale statistical comparison of computer-implemented algorithms for expressive lexical profiles in young children with traditional, hand-calculated measures showed that only VocD met criteria for evidence-based use in LSA. However, VocD was affected by sample elicitation context, suggesting that non-linguistic factors, such as engagement with elicitation props, contaminate estimates of spoken lexical skill in young children. Implications and suggested directions are discussed.
format Online
Article
Text
id pubmed-9257278
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9257278 2022-07-07 Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations. Yang, Ji Seung; Rosvold, Carly; Bernstein Ratner, Nan. Front Psychol. Psychology. Frontiers Media S.A. 2022-06-22 /pmc/articles/PMC9257278/ /pubmed/35814069 http://dx.doi.org/10.3389/fpsyg.2022.905789 Text en Copyright © 2022 Yang, Rosvold and Bernstein Ratner. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Measurement of Lexical Diversity in Children’s Spoken Language: Computational and Conceptual Considerations
topic Psychology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9257278/
https://www.ncbi.nlm.nih.gov/pubmed/35814069
http://dx.doi.org/10.3389/fpsyg.2022.905789
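
The description in this record names four lexical diversity measures without showing how they are computed. The following is a minimal Python sketch of three of them (TTR, NDW, and MATTR), written for illustration and not taken from the article. It makes several assumptions: tokens are already lowercased word forms, NDW is read as the number of distinct word types in the first 100 tokens, and the MATTR window length is a free parameter (the article's setting is not given here). VocD is omitted because it involves repeated random sampling and curve fitting that do not fit a short example. The function names `ttr`, `ndw`, and `mattr` are hypothetical.

```python
from typing import List


def ttr(tokens: List[str]) -> float:
    """Type-Token Ratio: distinct word types divided by total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def ndw(tokens: List[str], window: int = 100) -> int:
    """Number of Different Words in the first `window` tokens.

    Assumption: the "/100" in the description is read as a fixed
    100-token sample taken from the start of the transcript.
    """
    return len(set(tokens[:window]))


def mattr(tokens: List[str], window: int = 50) -> float:
    """Moving Average Type-Token Ratio.

    Computes TTR for every run of `window` consecutive tokens and
    returns the mean. If the sample is shorter than the window, this
    sketch falls back to plain TTR (an assumption, not a standard).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        return ttr(tokens)
    n_windows = len(tokens) - window + 1
    total = 0.0
    for start in range(n_windows):
        # TTR of the window beginning at `start`
        total += len(set(tokens[start:start + window])) / window
    return total / n_windows
```

For example, calling `mattr("the dog saw the cat and the cat saw the dog".split(), window=4)` averages the TTR over eight overlapping four-word windows. Because every window has the same length, MATTR avoids the sample-size instability that the description attributes to plain TTR.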