Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics
The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards f...
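Main authors: | Forkel, Robert; List, Johann-Mattis; Greenhill, Simon J.; Rzymski, Christoph; Bank, Sebastian; Cysouw, Michael; Hammarström, Harald; Haspelmath, Martin; Kaiping, Gereon A.; Gray, Russell D. |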
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6190742/ https://www.ncbi.nlm.nih.gov/pubmed/30325347 http://dx.doi.org/10.1038/sdata.2018.205 |
_version_ | 1783363619235299328 |
---|---|
author | Forkel, Robert List, Johann-Mattis Greenhill, Simon J. Rzymski, Christoph Bank, Sebastian Cysouw, Michael Hammarström, Harald Haspelmath, Martin Kaiping, Gereon A. Gray, Russell D. |
author_facet | Forkel, Robert List, Johann-Mattis Greenhill, Simon J. Rzymski, Christoph Bank, Sebastian Cysouw, Michael Hammarström, Harald Haspelmath, Martin Kaiping, Gereon A. Gray, Russell D. |
author_sort | Forkel, Robert |
collection | PubMed |
description | The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices. |
format | Online Article Text |
id | pubmed-6190742 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-61907422018-10-29 Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics Forkel, Robert List, Johann-Mattis Greenhill, Simon J. Rzymski, Christoph Bank, Sebastian Cysouw, Michael Hammarström, Harald Haspelmath, Martin Kaiping, Gereon A. Gray, Russell D. Sci Data Article The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices. Nature Publishing Group 2018-10-16 /pmc/articles/PMC6190742/ /pubmed/30325347 http://dx.doi.org/10.1038/sdata.2018.205 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Forkel, Robert List, Johann-Mattis Greenhill, Simon J. Rzymski, Christoph Bank, Sebastian Cysouw, Michael Hammarström, Harald Haspelmath, Martin Kaiping, Gereon A. Gray, Russell D. Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title | Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title_full | Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title_fullStr | Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title_full_unstemmed | Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title_short | Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics |
title_sort | cross-linguistic data formats, advancing data sharing and re-use in comparative linguistics |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6190742/ https://www.ncbi.nlm.nih.gov/pubmed/30325347 http://dx.doi.org/10.1038/sdata.2018.205 |
work_keys_str_mv | AT forkelrobert crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT listjohannmattis crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT greenhillsimonj crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT rzymskichristoph crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT banksebastian crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT cysouwmichael crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT hammarstromharald crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT haspelmathmartin crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT kaipinggereona crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics AT grayrusselld crosslinguisticdataformatsadvancingdatasharingandreuseincomparativelinguistics |