
The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies

Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, are br...


Bibliographic Details
Main authors: Rzymski, Christoph, Tresoldi, Tiago, Greenhill, Simon J., Wu, Mei-Shin, Schweikhard, Nathanael E., Koptjevskaja-Tamm, Maria, Gast, Volker, Bodt, Timotheus A., Hantgan, Abbie, Kaiping, Gereon A., Chang, Sophie, Lai, Yunfan, Morozova, Natalia, Arjava, Heini, Hübler, Nataliia, Koile, Ezequiel, Pepper, Steve, Proos, Mariann, Van Epps, Briana, Blanco, Ingrid, Hundt, Carolin, Monakhov, Sergei, Pianykh, Kristina, Ramesh, Sallona, Gray, Russell D., Forkel, Robert, List, Johann-Mattis
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6957499/
https://www.ncbi.nlm.nih.gov/pubmed/31932593
http://dx.doi.org/10.1038/s41597-019-0341-x
author Rzymski, Christoph
Tresoldi, Tiago
Greenhill, Simon J.
Wu, Mei-Shin
Schweikhard, Nathanael E.
Koptjevskaja-Tamm, Maria
Gast, Volker
Bodt, Timotheus A.
Hantgan, Abbie
Kaiping, Gereon A.
Chang, Sophie
Lai, Yunfan
Morozova, Natalia
Arjava, Heini
Hübler, Nataliia
Koile, Ezequiel
Pepper, Steve
Proos, Mariann
Van Epps, Briana
Blanco, Ingrid
Hundt, Carolin
Monakhov, Sergei
Pianykh, Kristina
Ramesh, Sallona
Gray, Russell D.
Forkel, Robert
List, Johann-Mattis
collection PubMed
description Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, place high demands on the rigor with which datasets are prepared and curated. Here we present the Database of Cross-Linguistic Colexifications (CLICS). CLICS tackles interconnected interdisciplinary research questions about the colexification of words across semantic categories in the world’s languages, and showcases best practices for preparing data for cross-linguistic research. This is done by addressing shortcomings of an earlier version of the database, CLICS2, and by supplying an updated version, CLICS3, which massively increases the size and scope of the project. We provide tools and guidelines for this purpose and discuss insights resulting from organizing student tasks for database updates.
format Online
Article
Text
id pubmed-6957499
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6957499 2020-01-22 Sci Data Data Descriptor
Nature Publishing Group UK 2020-01-13 /pmc/articles/PMC6957499/ /pubmed/31932593 http://dx.doi.org/10.1038/s41597-019-0341-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
title The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies
topic Data Descriptor