
Blackfoot Words: a database of Blackfoot lexical forms

This paper describes the structure and creation of Blackfoot Words, a new relational database of lexical forms (inflected words, stems, and morphemes) in Blackfoot (Algonquian; ISO 639-3: bla). To date, we have digitized 63,493 individual lexical forms from 30 sources, representing all four major di...


Bibliographic Details
Main authors: Weber, Natalie; Brown, Tyler; Celli, Joshua; Denham, McKenzie; Dykstra, Hailey; Hernandez-Merlin, Rodrigo; Hochstein, Evan; Hwang, Pinyu; Kidd, Nico; Kulmizev, Diana; Morrison, Hannah; Norris, Matty; Venkatraman, Lena
Format: Online Article Text
Language: English
Published: Springer Netherlands 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10064632/
https://www.ncbi.nlm.nih.gov/pubmed/37360262
http://dx.doi.org/10.1007/s10579-022-09631-2
author Weber, Natalie
Brown, Tyler
Celli, Joshua
Denham, McKenzie
Dykstra, Hailey
Hernandez-Merlin, Rodrigo
Hochstein, Evan
Hwang, Pinyu
Kidd, Nico
Kulmizev, Diana
Morrison, Hannah
Norris, Matty
Venkatraman, Lena
author_sort Weber, Natalie
collection PubMed
description This paper describes the structure and creation of Blackfoot Words, a new relational database of lexical forms (inflected words, stems, and morphemes) in Blackfoot (Algonquian; ISO 639-3: bla). To date, we have digitized 63,493 individual lexical forms from 30 sources, representing all four major dialects, and spanning the years 1743–2017. Version 1.1 of the database includes lexical forms from nine of these sources. This project has two aims. The first is to digitize and provide access to the lexical data in these sources, many of which are difficult to access and discover. The second is to organize the data so that connections can be made between instances of the “same” lexical form across all sources, despite variation across sources in the dialect recorded, orthographic conventions, and the depth of morpheme analysis. The database structure was developed in response to these aims. The database comprises five tables: Sources, Words, Stems, Morphemes, and Lemmas. The Sources table contains bibliographic information and commentary on the sources. The Words table contains inflected words in the source orthography. Each word is broken down into stems and morphemes which are entered into the Stems and Morphemes tables in the source orthography. The Lemmas table contains abstract versions of each stem or morpheme in a standardized orthography. Instances of the same stem or morpheme are linked to a common lemma. We expect that the database will support projects by the language community and other researchers.
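The five-table layout described above can be sketched as a small relational schema. This is only an illustrative sketch, not the project's actual schema: the table names come from the abstract, but every column name, the foreign-key layout (in particular, stems and morphemes pointing at a word and at a lemma), and the helper functions `build_demo` and `stem_instances` are assumptions. The inserted rows are placeholders, not real Blackfoot data.

```python
import sqlite3

# Table names follow the abstract; all column names and keys are assumptions.
SCHEMA = """
CREATE TABLE sources (
    source_id  INTEGER PRIMARY KEY,
    citation   TEXT NOT NULL,      -- bibliographic information
    commentary TEXT                -- notes on the source
);
CREATE TABLE lemmas (
    lemma_id INTEGER PRIMARY KEY,
    form     TEXT NOT NULL         -- standardized orthography
);
CREATE TABLE words (
    word_id   INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(source_id),
    form      TEXT NOT NULL        -- inflected word, source orthography
);
CREATE TABLE stems (
    stem_id  INTEGER PRIMARY KEY,
    word_id  INTEGER NOT NULL REFERENCES words(word_id),
    lemma_id INTEGER REFERENCES lemmas(lemma_id),
    form     TEXT NOT NULL         -- source orthography
);
CREATE TABLE morphemes (
    morpheme_id INTEGER PRIMARY KEY,
    word_id     INTEGER NOT NULL REFERENCES words(word_id),
    lemma_id    INTEGER REFERENCES lemmas(lemma_id),
    form        TEXT NOT NULL      -- source orthography
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database holding placeholder rows only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?, ?)",
        [(1, "Placeholder source A", None), (2, "Placeholder source B", None)],
    )
    conn.execute("INSERT INTO lemmas VALUES (1, 'LEMMA-1')")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?)",
        [(1, 1, "word-as-spelled-in-A"), (2, 2, "word-as-spelled-in-B")],
    )
    # Two spellings of the "same" stem, both linked to lemma 1.
    conn.executemany(
        "INSERT INTO stems VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "stem-as-spelled-in-A"), (2, 2, 1, "stem-as-spelled-in-B")],
    )
    return conn


def stem_instances(conn: sqlite3.Connection, lemma_id: int) -> list:
    """Return (citation, source-orthography form) for each stem under a lemma."""
    rows = conn.execute(
        """
        SELECT s.citation, st.form
        FROM stems AS st
        JOIN words AS w ON w.word_id = st.word_id
        JOIN sources AS s ON s.source_id = w.source_id
        WHERE st.lemma_id = ?
        ORDER BY s.source_id
        """,
        (lemma_id,),
    )
    return rows.fetchall()
```

Under these assumptions, calling `stem_instances(build_demo(), 1)` gathers every recorded spelling of one stem across sources, which is the kind of cross-source connection the shared lemma is meant to enable.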
format Online
Article
Text
id pubmed-10064632
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-10064632 2023-03-31 Blackfoot Words: a database of Blackfoot lexical forms Weber, Natalie Brown, Tyler Celli, Joshua Denham, McKenzie Dykstra, Hailey Hernandez-Merlin, Rodrigo Hochstein, Evan Hwang, Pinyu Kidd, Nico Kulmizev, Diana Morrison, Hannah Norris, Matty Venkatraman, Lena Lang Resour Eval Original Paper
Springer Netherlands 2023-03-31 /pmc/articles/PMC10064632/ /pubmed/37360262 http://dx.doi.org/10.1007/s10579-022-09631-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Blackfoot Words: a database of Blackfoot lexical forms
topic Original Paper