
Menzerath’s Law in the Syntax of Languages Compared with Random Sentences

The Menzerath law is considered to show an aspect of the complexity underlying natural language. This law suggests that, for a linguistic unit, the size (y) of a linguistic construct decreases as the number (x) of constructs in the unit increases. This article investigates this property syntacticall...

Bibliographic Details
Main author:	Tanaka-Ishii, Kumiko
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8230130/ https://www.ncbi.nlm.nih.gov/pubmed/34070370 http://dx.doi.org/10.3390/e23060661

collection	PubMed
description	The Menzerath law is considered to show an aspect of the complexity underlying natural language. This law suggests that, for a linguistic unit, the size (y) of a linguistic construct decreases as the number (x) of constructs in the unit increases. This article investigates this property syntactically, with x as the number of constituents modifying the main predicate of a sentence and y as the size of those constituents in terms of the number of words. Following previous articles that demonstrated that the Menzerath property held for dependency corpora, such as in Czech and Ukrainian, this article first examines how well the property applies across languages by using the entire Universal Dependency dataset ver. 2.3, including 76 languages over 129 corpora and the Penn Treebank (PTB). The results show that the law holds reasonably well for [Formula: see text]. Then, for comparison, the property is investigated with syntactically randomized sentences generated from the PTB. These results show that the property is almost reproducible even from simple random data. Further analysis of the property highlights more detailed characteristics of natural language.
format	Online Article Text
id	pubmed-8230130
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8230130 2021-06-26 Menzerath’s Law in the Syntax of Languages Compared with Random Sentences Tanaka-Ishii, Kumiko Entropy (Basel) Article The Menzerath law is considered to show an aspect of the complexity underlying natural language. This law suggests that, for a linguistic unit, the size (y) of a linguistic construct decreases as the number (x) of constructs in the unit increases. This article investigates this property syntactically, with x as the number of constituents modifying the main predicate of a sentence and y as the size of those constituents in terms of the number of words. Following previous articles that demonstrated that the Menzerath property held for dependency corpora, such as in Czech and Ukrainian, this article first examines how well the property applies across languages by using the entire Universal Dependency dataset ver. 2.3, including 76 languages over 129 corpora and the Penn Treebank (PTB). The results show that the law holds reasonably well for [Formula: see text]. Then, for comparison, the property is investigated with syntactically randomized sentences generated from the PTB. These results show that the property is almost reproducible even from simple random data. Further analysis of the property highlights more detailed characteristics of natural language. MDPI 2021-05-25 /pmc/articles/PMC8230130/ /pubmed/34070370 http://dx.doi.org/10.3390/e23060661 Text en © 2021 by the author. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
