
Distributional Measures of Semantic Abstraction

This article provides an in-depth study of distributional measures for distinguishing between degrees of semantic abstraction. Abstraction is considered a “central construct in cognitive science” (Barsalou, 2003) and a “process of information reduction that allows for efficient storage and retrieval...


Bibliographic Details
Main Authors:	Schulte im Walde, Sabine; Frassinelli, Diego
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Artificial Intelligence
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8892386/ https://www.ncbi.nlm.nih.gov/pubmed/35252847 http://dx.doi.org/10.3389/frai.2021.796756

author	Schulte im Walde, Sabine; Frassinelli, Diego
collection	PubMed
description	This article provides an in-depth study of distributional measures for distinguishing between degrees of semantic abstraction. Abstraction is considered a “central construct in cognitive science” (Barsalou, 2003) and a “process of information reduction that allows for efficient storage and retrieval of central knowledge” (Burgoon et al., 2013). Relying on the distributional hypothesis, computational studies have successfully exploited measures of contextual co-occurrence and neighbourhood density to distinguish between conceptual semantic categorisations. So far, these studies have modeled semantic abstraction across lexical-semantic tasks such as ambiguity; diachronic meaning changes; abstractness vs. concreteness; and hypernymy. Yet, the distributional approaches target different conceptual types of semantic relatedness, and to our knowledge, not much attention has been paid to applying, comparing or analysing the computational abstraction measures across conceptual tasks. The current article suggests a novel perspective that exploits variants of distributional measures to investigate semantic abstraction in English in terms of the abstract–concrete dichotomy (e.g., glory–banana) and in terms of the generality–specificity distinction (e.g., animal–fish), in order to compare the strengths and weaknesses of the measures regarding categorisations of abstraction, and to determine and investigate conceptual differences. In a series of experiments we identify reliable distributional measures for both instantiations of lexical-semantic abstraction and reach a precision higher than 0.7, but the measures clearly differ for the abstract–concrete vs. abstract–specific distinctions and for nouns vs. verbs.
Overall, we identify two groups of measures, (i) frequency and word entropy when distinguishing between more and less abstract words in terms of the generality–specificity distinction, and (ii) neighbourhood density variants (especially target–context diversity) when distinguishing between more and less abstract words in terms of the abstract–concrete dichotomy. We conclude that more general words are used more often and are less surprising than more specific words, and that abstract words establish themselves empirically in semantically more diverse contexts than concrete words. Finally, our experiments once more point out that distributional models of conceptual categorisations need to take word classes and ambiguity into account: results for nouns vs. verbs differ in many respects, and ambiguity hinders fine-tuning empirical observations.
format	Online Article Text
id	pubmed-8892386
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-8892386 2022-03-04 Distributional Measures of Semantic Abstraction Schulte im Walde, Sabine; Frassinelli, Diego Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2022-02-08 /pmc/articles/PMC8892386/ /pubmed/35252847 http://dx.doi.org/10.3389/frai.2021.796756 Text en Copyright © 2022 Schulte im Walde and Frassinelli. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Distributional Measures of Semantic Abstraction
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8892386/ https://www.ncbi.nlm.nih.gov/pubmed/35252847 http://dx.doi.org/10.3389/frai.2021.796756
