Using a high-dimensional graph of semantic space to model relationships among words
The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically...
Main authors: | Jackson, Alice F.; Bolger, Donald J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2014 |
Subjects: | Psychology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4026710/ https://www.ncbi.nlm.nih.gov/pubmed/24860525 http://dx.doi.org/10.3389/fpsyg.2014.00385 |
_version_ | 1782316885826600960 |
---|---|
author | Jackson, Alice F. Bolger, Donald J. |
author_facet | Jackson, Alice F. Bolger, Donald J. |
author_sort | Jackson, Alice F. |
collection | PubMed |
description | The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph-structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent-word level, were compared directly against a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD). |
format | Online Article Text |
id | pubmed-4026710 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4026710 2014-05-23 Using a high-dimensional graph of semantic space to model relationships among words Jackson, Alice F. Bolger, Donald J. Front Psychol Psychology Frontiers Media S.A. 2014-05-12 /pmc/articles/PMC4026710/ /pubmed/24860525 http://dx.doi.org/10.3389/fpsyg.2014.00385 Text en Copyright © 2014 Jackson and Bolger. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychology Jackson, Alice F. Bolger, Donald J. Using a high-dimensional graph of semantic space to model relationships among words |
title | Using a high-dimensional graph of semantic space to model relationships among words |
title_sort | using a high-dimensional graph of semantic space to model relationships among words |
topic | Psychology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4026710/ https://www.ncbi.nlm.nih.gov/pubmed/24860525 http://dx.doi.org/10.3389/fpsyg.2014.00385 |
work_keys_str_mv | AT jacksonalicef usingahighdimensionalgraphofsemanticspacetomodelrelationshipsamongwords AT bolgerdonaldj usingahighdimensionalgraphofsemanticspacetomodelrelationshipsamongwords |
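
The description in this record outlines two ideas that a small sketch can make concrete: associative relatedness read off a direct co-occurrence edge, and semantic similarity estimated from the intersection of two words' first-order neighbors. The Python below is an illustrative toy under stated assumptions, not the authors' implementation. The function names (`build_graph`, `associated`, `neighbor_overlap`), the unweighted edges, the toy corpus, and the Jaccard normalization of the intersection are all hypothetical choices made here; the record does not specify how GOLD weights edges or scores overlap.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(units, window=None):
    """Build an undirected, unweighted co-occurrence graph.

    units  : list of token lists (here, one list per paragraph).
    window : None links every pair of distinct words in a unit
             (paragraph-level window, in the spirit of bigGOLD);
             1 links only adjacent words (in the spirit of smallGOLD).
    """
    graph = defaultdict(set)
    for tokens in units:
        if window is None:
            pairs = combinations(set(tokens), 2)
        else:
            pairs = (
                (tokens[i], tokens[j])
                for i in range(len(tokens))
                for j in range(i + 1, min(i + 1 + window, len(tokens)))
            )
        for a, b in pairs:
            if a != b:  # skip self-links from repeated words
                graph[a].add(b)
                graph[b].add(a)
    return graph


def associated(graph, a, b):
    """Associative relatedness as a direct (first-order) edge."""
    return b in graph.get(a, set())


def neighbor_overlap(graph, a, b):
    """Similarity from shared first-order neighbors.

    Assumption: the intersection is normalized by the union
    (Jaccard index); the two words themselves are excluded.
    """
    na = graph.get(a, set()) - {b}
    nb = graph.get(b, set()) - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy corpus: three one-sentence "paragraphs".
paragraphs = [
    "the doctor treated the patient".split(),
    "the nurse treated the patient".split(),
    "the doctor wrote a prescription".split(),
]

big = build_graph(paragraphs)              # paragraph-level window
small = build_graph(paragraphs, window=1)  # adjacent-word window

print(associated(big, "doctor", "patient"))      # direct co-occurrence
print(associated(big, "doctor", "nurse"))        # never co-occur
print(neighbor_overlap(big, "doctor", "nurse"))  # shared contexts
print(neighbor_overlap(small, "doctor", "nurse"))
```

In this toy corpus, "doctor" and "nurse" never appear in the same paragraph, so they share no edge, yet they share neighbors such as "treated". That is the pattern the description hypothesizes for semantically similar words. A real corpus would also need stop-word handling and edge weights, which this sketch omits.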