A large dataset of semantic ratings and its computational extension
Evidence from psychology and cognitive neuroscience indicates that the human brain’s semantic system contains several specific subsystems, each representing a particular dimension of semantic information. Word ratings on these different semantic dimensions can help investigate the behavioral and neu...
Main Authors: | Wang, Shaonan; Zhang, Yunhao; Shi, Weiting; Zhang, Guangyao; Zhang, Jiajun; Lin, Nan; Zong, Chengqing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Nature Publishing Group UK
2023
|
Subjects: | Data Descriptor |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9950052/ https://www.ncbi.nlm.nih.gov/pubmed/36823158 http://dx.doi.org/10.1038/s41597-023-01995-6 |
_version_ | 1784893078636593152 |
---|---|
author | Wang, Shaonan; Zhang, Yunhao; Shi, Weiting; Zhang, Guangyao; Zhang, Jiajun; Lin, Nan; Zong, Chengqing |
author_facet | Wang, Shaonan; Zhang, Yunhao; Shi, Weiting; Zhang, Guangyao; Zhang, Jiajun; Lin, Nan; Zong, Chengqing |
author_sort | Wang, Shaonan |
collection | PubMed |
description | Evidence from psychology and cognitive neuroscience indicates that the human brain’s semantic system contains several specific subsystems, each representing a particular dimension of semantic information. Word ratings on these different semantic dimensions can help investigate the behavioral and neural impacts of semantic dimensions on language processes and build computational representations of language meaning according to the semantic space of the human cognitive system. Existing semantic rating databases provide ratings for hundreds to thousands of words, which can hardly support a comprehensive semantic analysis of natural texts or speech. This article reports a large database, the Six Semantic Dimension Database (SSDD), which contains subjective ratings for 17,940 commonly used Chinese words on six major semantic dimensions: vision, motor, socialness, emotion, time, and space. Furthermore, using computational models to learn the mapping relations between subjective ratings and word embeddings, we include the estimated semantic ratings for 1,427,992 Chinese and 1,515,633 English words in the SSDD. The SSDD will aid studies on natural language processing, text analysis, and semantic representation in the brain. |
format | Online Article Text |
id | pubmed-9950052 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9950052 2023-02-25 A large dataset of semantic ratings and its computational extension Wang, Shaonan Zhang, Yunhao Shi, Weiting Zhang, Guangyao Zhang, Jiajun Lin, Nan Zong, Chengqing Sci Data Data Descriptor Evidence from psychology and cognitive neuroscience indicates that the human brain’s semantic system contains several specific subsystems, each representing a particular dimension of semantic information. Word ratings on these different semantic dimensions can help investigate the behavioral and neural impacts of semantic dimensions on language processes and build computational representations of language meaning according to the semantic space of the human cognitive system. Existing semantic rating databases provide ratings for hundreds to thousands of words, which can hardly support a comprehensive semantic analysis of natural texts or speech. This article reports a large database, the Six Semantic Dimension Database (SSDD), which contains subjective ratings for 17,940 commonly used Chinese words on six major semantic dimensions: vision, motor, socialness, emotion, time, and space. Furthermore, using computational models to learn the mapping relations between subjective ratings and word embeddings, we include the estimated semantic ratings for 1,427,992 Chinese and 1,515,633 English words in the SSDD. The SSDD will aid studies on natural language processing, text analysis, and semantic representation in the brain.
Nature Publishing Group UK 2023-02-23 /pmc/articles/PMC9950052/ /pubmed/36823158 http://dx.doi.org/10.1038/s41597-023-01995-6 Text en © The Author(s) 2023, corrected publication 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Data Descriptor Wang, Shaonan Zhang, Yunhao Shi, Weiting Zhang, Guangyao Zhang, Jiajun Lin, Nan Zong, Chengqing A large dataset of semantic ratings and its computational extension |
title | A large dataset of semantic ratings and its computational extension |
title_full | A large dataset of semantic ratings and its computational extension |
title_fullStr | A large dataset of semantic ratings and its computational extension |
title_full_unstemmed | A large dataset of semantic ratings and its computational extension |
title_short | A large dataset of semantic ratings and its computational extension |
title_sort | large dataset of semantic ratings and its computational extension |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9950052/ https://www.ncbi.nlm.nih.gov/pubmed/36823158 http://dx.doi.org/10.1038/s41597-023-01995-6 |