An Empirical Model for n-gram Frequency Distribution in Large Corpora
Statistical multiword extraction methods can benefit from the knowledge on the n-gram ([Formula: see text]) frequency distribution in natural language corpora, for indexing and time/space optimization purposes. The appearance of increasingly large corpora raises new challenges on the investigation o...
Main Authors: | Silva, Joaquim F., Cunha, Jose C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206297/ http://dx.doi.org/10.1007/978-3-030-47436-2_63 |
Similar Items
- Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
  by: Collective
  Published: (1999)
- Metaphor Identification in Large Texts Corpora
  by: Neuman, Yair, et al.
  Published: (2013)
- Workshop on Very Large Corpora : Academic and Industrial Perspectives
  Published: (1993)
- Frequency, Informativity and Word Length: Insights from Typologically Diverse Corpora
  by: Levshina, Natalia
  Published: (2022)
- Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora
  by: Al-Thubaity, Abdulmohsen, et al.
  Published: (2014)