
Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics

The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards f...

Bibliographic Details
Main authors: Forkel, Robert, List, Johann-Mattis, Greenhill, Simon J., Rzymski, Christoph, Bank, Sebastian, Cysouw, Michael, Hammarström, Harald, Haspelmath, Martin, Kaiping, Gereon A., Gray, Russell D.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6190742/
https://www.ncbi.nlm.nih.gov/pubmed/30325347
http://dx.doi.org/10.1038/sdata.2018.205
author Forkel, Robert
List, Johann-Mattis
Greenhill, Simon J.
Rzymski, Christoph
Bank, Sebastian
Cysouw, Michael
Hammarström, Harald
Haspelmath, Martin
Kaiping, Gereon A.
Gray, Russell D.
author_sort Forkel, Robert
collection PubMed
description The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices.
format Online
Article
Text
id pubmed-6190742
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-61907422018-10-29 Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics Forkel, Robert List, Johann-Mattis Greenhill, Simon J. Rzymski, Christoph Bank, Sebastian Cysouw, Michael Hammarström, Harald Haspelmath, Martin Kaiping, Gereon A. Gray, Russell D. Sci Data Article The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices. Nature Publishing Group 2018-10-16 /pmc/articles/PMC6190742/ /pubmed/30325347 http://dx.doi.org/10.1038/sdata.2018.205 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6190742/
https://www.ncbi.nlm.nih.gov/pubmed/30325347
http://dx.doi.org/10.1038/sdata.2018.205