Productivity and Predictability for Measuring Morphological Complexity

We propose a quantitative approach for quantifying morphological complexity of a language based on text. Several corpus-based methods have focused on measuring the different word forms that a language can produce. We take into account not only the productivity of morphological processes but also the...

Bibliographic Details
Main authors: Gutierrez-Vasques, Ximena; Mijangos, Victor
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516478/
https://www.ncbi.nlm.nih.gov/pubmed/33285823
http://dx.doi.org/10.3390/e22010048
author Gutierrez-Vasques, Ximena
Mijangos, Victor
collection PubMed
description We propose a quantitative approach for quantifying morphological complexity of a language based on text. Several corpus-based methods have focused on measuring the different word forms that a language can produce. We take into account not only the productivity of morphological processes but also the predictability of those morphological processes. We use a language model that predicts the probability of sub-word sequences within a word; we calculate the entropy rate of this model and use it as a measure of predictability of the internal structure of words. Our results show that it is important to integrate these two dimensions when measuring morphological complexity, since languages can be complex under one measure but simpler under another one. We calculated the complexity measures in two different parallel corpora for a typologically diverse set of languages. Our approach is corpus-based and it does not require the use of linguistic annotated data.
format Online
Article
Text
id pubmed-7516478
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75164782020-11-09 Productivity and Predictability for Measuring Morphological Complexity Gutierrez-Vasques, Ximena Mijangos, Victor Entropy (Basel) Article We propose a quantitative approach for quantifying morphological complexity of a language based on text. Several corpus-based methods have focused on measuring the different word forms that a language can produce. We take into account not only the productivity of morphological processes but also the predictability of those morphological processes. We use a language model that predicts the probability of sub-word sequences within a word; we calculate the entropy rate of this model and use it as a measure of predictability of the internal structure of words. Our results show that it is important to integrate these two dimensions when measuring morphological complexity, since languages can be complex under one measure but simpler under another one. We calculated the complexity measures in two different parallel corpora for a typologically diverse set of languages. Our approach is corpus-based and it does not require the use of linguistic annotated data. MDPI 2019-12-30 /pmc/articles/PMC7516478/ /pubmed/33285823 http://dx.doi.org/10.3390/e22010048 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Productivity and Predictability for Measuring Morphological Complexity
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516478/
https://www.ncbi.nlm.nih.gov/pubmed/33285823
http://dx.doi.org/10.3390/e22010048