
Best practices for spatial language data harmonization, sharing and map creation—A case study of Uralic

Despite remarkable progress in digital linguistics, extensive databases of geographical language distributions are missing. This hampers both studies on language spatiality and public outreach of language diversity. We present best practices for creating and sharing digital spatial language data by...


Bibliographic Details
Main authors: Rantanen, Timo, Tolvanen, Harri, Roose, Meeli, Ylikoski, Jussi, Vesakoski, Outi
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9176854/
https://www.ncbi.nlm.nih.gov/pubmed/35675367
http://dx.doi.org/10.1371/journal.pone.0269648
_version_ 1784722761954885632
author Rantanen, Timo
Tolvanen, Harri
Roose, Meeli
Ylikoski, Jussi
Vesakoski, Outi
collection PubMed
description Despite remarkable progress in digital linguistics, extensive databases of geographical language distributions are missing. This hampers both studies on language spatiality and public outreach of language diversity. We present best practices for creating and sharing digital spatial language data by collecting and harmonizing Uralic language distributions as a case study. Language distribution studies have utilized various methodologies, and the results are often available only as printed maps or written descriptions. In order to analyze language spatiality, the information must be digitized into geospatial data, which contains location, time and other parameters. When compiled and harmonized, this data can be used to study changes in languages’ distribution, and it can be combined with, for example, population and environmental data. We also utilized the knowledge of language experts to adjust previous and new information on language distributions into state-of-the-art maps. The extensive database, including the distribution datasets and detailed map visualizations of the Uralic languages, is introduced alongside this article and is freely available.
format Online
Article
Text
id pubmed-9176854
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9176854 2022-06-09 PLoS One Research Article Public Library of Science 2022-06-08 /pmc/articles/PMC9176854/ /pubmed/35675367 http://dx.doi.org/10.1371/journal.pone.0269648 Text en © 2022 Rantanen et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Best practices for spatial language data harmonization, sharing and map creation—A case study of Uralic
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9176854/
https://www.ncbi.nlm.nih.gov/pubmed/35675367
http://dx.doi.org/10.1371/journal.pone.0269648