Unsupervised Text Segmentation Predicts Eye Fixations During Reading

Words typically form the basis of psycholinguistic and computational linguistic studies about sentence processing. However, recent evidence shows the basic units during reading, i.e., the items in the mental lexicon, are not always words, but could also be sub-word and supra-word units. To recognize...

Bibliographic Details
Main authors: Yang, Jinbiao; van den Bosch, Antal; Frank, Stefan L.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8905434/
https://www.ncbi.nlm.nih.gov/pubmed/35280234
http://dx.doi.org/10.3389/frai.2022.731615
_version_ 1784665185176256512
author Yang, Jinbiao
van den Bosch, Antal
Frank, Stefan L.
collection PubMed
description Words typically form the basis of psycholinguistic and computational linguistic studies about sentence processing. However, recent evidence shows the basic units during reading, i.e., the items in the mental lexicon, are not always words, but could also be sub-word and supra-word units. To recognize these units, human readers require a cognitive mechanism to learn and detect them. In this paper, we assume eye fixations during reading reveal the locations of the cognitive units, and that the cognitive units are analogous with the text units discovered by unsupervised segmentation models. We predict eye fixations by model-segmented units on both English and Dutch text. The results show the model-segmented units predict eye fixations better than word units. This finding suggests that the predictive performance of model-segmented units indicates their plausibility as cognitive units. The Less-is-Better (LiB) model, which finds the units that minimize both long-term and working memory load, offers advantages both in terms of prediction score and efficiency among alternative models. Our results also suggest that modeling the least-effort principle for the management of long-term and working memory can lead to inferring cognitive units. Overall, the study supports the theory that the mental lexicon stores not only words but also smaller and larger units, suggests that fixation locations during reading depend on these units, and shows that unsupervised segmentation models can discover these units.
format Online
Article
Text
id pubmed-8905434
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8905434 2022-03-10 Unsupervised Text Segmentation Predicts Eye Fixations During Reading Yang, Jinbiao van den Bosch, Antal Frank, Stefan L. Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2022-02-23 /pmc/articles/PMC8905434/ /pubmed/35280234 http://dx.doi.org/10.3389/frai.2022.731615 Text en Copyright © 2022 Yang, van den Bosch and Frank.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Unsupervised Text Segmentation Predicts Eye Fixations During Reading
topic Artificial Intelligence