An Empirical Model for n-gram Frequency Distribution in Large Corpora

Statistical multiword extraction methods can benefit from the knowledge on the n-gram ([Formula: see text]) frequency distribution in natural language corpora, for indexing and time/space optimization purposes. The appearance of increasingly large corpora raises new challenges on the investigation o...

Bibliographic Details
Main authors: Silva, Joaquim F., Cunha, Jose C.
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206297/
http://dx.doi.org/10.1007/978-3-030-47436-2_63