Blackfoot Words: a database of Blackfoot lexical forms
This paper describes the structure and creation of Blackfoot Words, a new relational database of lexical forms (inflected words, stems, and morphemes) in Blackfoot (Algonquian; ISO 639-3: bla). To date, we have digitized 63,493 individual lexical forms from 30 sources, representing all four major di...
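Main authors: | Weber, Natalie; Brown, Tyler; Celli, Joshua; Denham, McKenzie; Dykstra, Hailey; Hernandez-Merlin, Rodrigo; Hochstein, Evan; Hwang, Pinyu; Kidd, Nico; Kulmizev, Diana; Morrison, Hannah; Norris, Matty; Venkatraman, Lena |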
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Netherlands 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10064632/ https://www.ncbi.nlm.nih.gov/pubmed/37360262 http://dx.doi.org/10.1007/s10579-022-09631-2 |
_version_ | 1785017939853836288 |
---|---|
author | Weber, Natalie; Brown, Tyler; Celli, Joshua; Denham, McKenzie; Dykstra, Hailey; Hernandez-Merlin, Rodrigo; Hochstein, Evan; Hwang, Pinyu; Kidd, Nico; Kulmizev, Diana; Morrison, Hannah; Norris, Matty; Venkatraman, Lena |
author_facet | Weber, Natalie; Brown, Tyler; Celli, Joshua; Denham, McKenzie; Dykstra, Hailey; Hernandez-Merlin, Rodrigo; Hochstein, Evan; Hwang, Pinyu; Kidd, Nico; Kulmizev, Diana; Morrison, Hannah; Norris, Matty; Venkatraman, Lena |
author_sort | Weber, Natalie |
collection | PubMed |
description | This paper describes the structure and creation of Blackfoot Words, a new relational database of lexical forms (inflected words, stems, and morphemes) in Blackfoot (Algonquian; ISO 639-3: bla). To date, we have digitized 63,493 individual lexical forms from 30 sources, representing all four major dialects, and spanning the years 1743–2017. Version 1.1 of the database includes lexical forms from nine of these sources. This project has two aims. The first is to digitize and provide access to the lexical data in these sources, many of which are difficult to access and discover. The second is to organize the data so that connections can be made between instances of the “same” lexical form across all sources, despite variation across sources in the dialect recorded, orthographic conventions, and the depth of morpheme analysis. The database structure was developed in response to these aims. The database comprises five tables: Sources, Words, Stems, Morphemes, and Lemmas. The Sources table contains bibliographic information and commentary on the sources. The Words table contains inflected words in the source orthography. Each word is broken down into stems and morphemes which are entered into the Stems and Morphemes tables in the source orthography. The Lemmas table contains abstract versions of each stem or morpheme in a standardized orthography. Instances of the same stem or morpheme are linked to a common lemma. We expect that the database will support projects by the language community and other researchers. |
format | Online Article Text |
id | pubmed-10064632 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Netherlands |
record_format | MEDLINE/PubMed |
spelling | pubmed-10064632 2023-03-31 Blackfoot Words: a database of Blackfoot lexical forms Weber, Natalie Brown, Tyler Celli, Joshua Denham, McKenzie Dykstra, Hailey Hernandez-Merlin, Rodrigo Hochstein, Evan Hwang, Pinyu Kidd, Nico Kulmizev, Diana Morrison, Hannah Norris, Matty Venkatraman, Lena Lang Resour Eval Original Paper This paper describes the structure and creation of Blackfoot Words, a new relational database of lexical forms (inflected words, stems, and morphemes) in Blackfoot (Algonquian; ISO 639-3: bla). To date, we have digitized 63,493 individual lexical forms from 30 sources, representing all four major dialects, and spanning the years 1743–2017. Version 1.1 of the database includes lexical forms from nine of these sources. This project has two aims. The first is to digitize and provide access to the lexical data in these sources, many of which are difficult to access and discover. The second is to organize the data so that connections can be made between instances of the “same” lexical form across all sources, despite variation across sources in the dialect recorded, orthographic conventions, and the depth of morpheme analysis. The database structure was developed in response to these aims. The database comprises five tables: Sources, Words, Stems, Morphemes, and Lemmas. The Sources table contains bibliographic information and commentary on the sources. The Words table contains inflected words in the source orthography. Each word is broken down into stems and morphemes which are entered into the Stems and Morphemes tables in the source orthography. The Lemmas table contains abstract versions of each stem or morpheme in a standardized orthography. Instances of the same stem or morpheme are linked to a common lemma. We expect that the database will support projects by the language community and other researchers. 
Springer Netherlands 2023-03-31 /pmc/articles/PMC10064632/ /pubmed/37360262 http://dx.doi.org/10.1007/s10579-022-09631-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Original Paper Weber, Natalie Brown, Tyler Celli, Joshua Denham, McKenzie Dykstra, Hailey Hernandez-Merlin, Rodrigo Hochstein, Evan Hwang, Pinyu Kidd, Nico Kulmizev, Diana Morrison, Hannah Norris, Matty Venkatraman, Lena Blackfoot Words: a database of Blackfoot lexical forms |
title | Blackfoot Words: a database of Blackfoot lexical forms |
title_full | Blackfoot Words: a database of Blackfoot lexical forms |
title_fullStr | Blackfoot Words: a database of Blackfoot lexical forms |
title_full_unstemmed | Blackfoot Words: a database of Blackfoot lexical forms |
title_short | Blackfoot Words: a database of Blackfoot lexical forms |
title_sort | blackfoot words: a database of blackfoot lexical forms |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10064632/ https://www.ncbi.nlm.nih.gov/pubmed/37360262 http://dx.doi.org/10.1007/s10579-022-09631-2 |
work_keys_str_mv | AT webernatalie blackfootwordsadatabaseofblackfootlexicalforms AT browntyler blackfootwordsadatabaseofblackfootlexicalforms AT cellijoshua blackfootwordsadatabaseofblackfootlexicalforms AT denhammckenzie blackfootwordsadatabaseofblackfootlexicalforms AT dykstrahailey blackfootwordsadatabaseofblackfootlexicalforms AT hernandezmerlinrodrigo blackfootwordsadatabaseofblackfootlexicalforms AT hochsteinevan blackfootwordsadatabaseofblackfootlexicalforms AT hwangpinyu blackfootwordsadatabaseofblackfootlexicalforms AT kiddnico blackfootwordsadatabaseofblackfootlexicalforms AT kulmizevdiana blackfootwordsadatabaseofblackfootlexicalforms AT morrisonhannah blackfootwordsadatabaseofblackfootlexicalforms AT norrismatty blackfootwordsadatabaseofblackfootlexicalforms AT venkatramanlena blackfootwordsadatabaseofblackfootlexicalforms |
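
The five-table layout summarized in the description field (Sources, Words, Stems, Morphemes, Lemmas) can be sketched as a small relational schema. The following is only an illustration using Python's built-in sqlite3 module, not the schema of the published database: every column name, the choice to attach stems and morphemes directly to a word, and all placeholder rows are assumptions made for this example.

```python
import sqlite3

# Hypothetical schema: table names follow the description; all columns are assumed.
SCHEMA = """
CREATE TABLE sources (
    source_id INTEGER PRIMARY KEY,
    citation  TEXT NOT NULL,   -- bibliographic information
    comment   TEXT             -- commentary on the source
);
CREATE TABLE lemmas (
    lemma_id INTEGER PRIMARY KEY,
    form     TEXT NOT NULL     -- standardized orthography
);
CREATE TABLE words (
    word_id   INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(source_id),
    form      TEXT NOT NULL    -- inflected word, source orthography
);
CREATE TABLE stems (
    stem_id  INTEGER PRIMARY KEY,
    word_id  INTEGER NOT NULL REFERENCES words(word_id),
    lemma_id INTEGER REFERENCES lemmas(lemma_id),
    form     TEXT NOT NULL     -- source orthography
);
CREATE TABLE morphemes (
    morpheme_id INTEGER PRIMARY KEY,
    word_id     INTEGER NOT NULL REFERENCES words(word_id),
    lemma_id    INTEGER REFERENCES lemmas(lemma_id),
    form        TEXT NOT NULL  -- source orthography
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with placeholder (non-Blackfoot) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?, ?)",
        [(1, "Source A (placeholder)", None), (2, "Source B (placeholder)", None)],
    )
    conn.execute("INSERT INTO lemmas VALUES (1, 'LEMMA-1')")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?)",
        [(1, 1, "word-a"), (2, 2, "word-b")],
    )
    # Two differently spelled stems from two sources, linked to one lemma.
    conn.executemany(
        "INSERT INTO stems VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "stem-a"), (2, 2, 1, "stem-b")],
    )
    return conn


def stems_for_lemma(conn: sqlite3.Connection, lemma_form: str) -> list:
    """Return (citation, stem form) pairs for every stem linked to a lemma."""
    return conn.execute(
        """
        SELECT so.citation, st.form
        FROM stems st
        JOIN words w    ON w.word_id = st.word_id
        JOIN sources so ON so.source_id = w.source_id
        JOIN lemmas l   ON l.lemma_id = st.lemma_id
        WHERE l.form = ?
        ORDER BY so.source_id
        """,
        (lemma_form,),
    ).fetchall()


if __name__ == "__main__":
    demo = build_demo()
    for citation, form in stems_for_lemma(demo, "LEMMA-1"):
        print(citation, form)
```

Linking each stem or morpheme row to a shared lemma row is what would let a query collect instances of the "same" form across sources even when their source spellings differ, which is the second aim stated in the description.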