An Empirical Model for n-gram Frequency Distribution in Large Corpora

Statistical multiword extraction methods can benefit from the knowledge on the n-gram ([Formula: see text]) frequency distribution in natural language corpora, for indexing and time/space optimization purposes. The appearance of increasingly large corpora raises new challenges on the investigation o...

Bibliographic Details
Main Authors:	Silva, Joaquim F., Cunha, Jose C.
Format:	Online Article Text
Language:	English
Published:	2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206297/ http://dx.doi.org/10.1007/978-3-030-47436-2_63

Similar Items

Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
by: Collective
Published: (1999)

Metaphor Identification in Large Texts Corpora
by: Neuman, Yair, et al.
Published: (2013)

Workshop on Very Large Corpora : Academic and Industrial Perspectives
Published: (1993)

Frequency, Informativity and Word Length: Insights from Typologically Diverse Corpora
by: Levshina, Natalia
Published: (2022)

Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora
by: Al-Thubaity, Abdulmohsen, et al.
Published: (2014)

Peptidomic Analysis of the Brain and Corpora Cardiaca-Corpora Allata Complex in the Bombyx mori
by: Liu, Xiaoguang, et al.
Published: (2012)

Corpora in applied linguistics /
by: Hunston, Susan, 1953-
Published: (2002)

Observations on Corpora Lutea
by: Paterson, Robert
Published: (1841)

6th ACL SIGDAT Workshop on Very Large Corpora
Published: (1998)

4th ACL SIGDAT Workshop on Very Large Corpora
Published: (1996)

Adapting Multiple Distributions for Bridging Emotions from Different Speech Corpora
by: Zong, Yuan, et al.
Published: (2022)

Using corpora in the language classroom /
by: Reppen, Randi
Published: (2010)

Treatments for fibrosis of the corpora cavernosa
by: Egydio, Paulo H., et al.
Published: (2013)

Large Corpora and Historical Syntax: Consequences for the Study of Morphosyntactic Diffusion in the History of Spanish
by: Octavio de Toledo y Huerta, Álvaro S.
Published: (2019)

Strategies for the Analysis of Large Social Media Corpora: Sampling and Keyword Extraction Methods
by: Moreno-Ortiz, Antonio, et al.
Published: (2023)

Persistent Priapism, from Thrombosis of the Corpora Cavernosa
by: Weber, F. Parkes
Published: (1898)

An analysis on the entity annotations in biological corpora
by: Neves, Mariana
Published: (2014)

Observations on Corpora Lutea: Part I
by: Paterson, Robert
Published: (1840)

Reflections on Gender Analyses of Bibliographic Corpora
by: Mihaljević, Helena, et al.
Published: (2019)

The ParlaMint corpora of parliamentary proceedings
by: Erjavec, Tomaž, et al.
Published: (2022)

Total corpora mobilization for penile reconstruction
by: Barroso, Ubirajara, et al.
Published: (2022)

Corpora Amylacea in Neurodegenerative Diseases: Cause or Effect?
by: Rohn, Troy T.
Published: (2015)

New perspectives on corpora amylacea in the human brain
by: Augé, Elisabet, et al.
Published: (2017)

The venous drainage of the corpora cavernosa in the human penis
by: Hsu, Geng-Long, et al.
Published: (2013)

Pooling annotated corpora for clinical concept extraction
by: Wagholikar, Kavishwar B, et al.
Published: (2013)

Traumatic Metachronous Penile Fracture to the Contralateral Corpora
by: Papaioannou, Christos, et al.
Published: (2022)

Editorial: Language, corpora, and technology in applied linguistics
by: Naqvi, Swaleha Bano, et al.
Published: (2023)

WARCProcessor: An Integrative Tool for Building and Management of Web Spam Corpora
by: Callón, Miguel, et al.
Published: (2017)

Current Knowledge on the Multifactorial Regulation of Corpora Lutea Lifespan: The Rabbit Model
by: Zerani, Massimo, et al.
Published: (2021)

Chronological corpora curve clustering: From scientific corpora construction to knowledge dynamics discovery through word life-cycles clustering
by: Trevisani, Matilde, et al.
Published: (2018)

Quantitative analysis of size and regional distribution of corpora amylacea in the hippocampal formation of obstructive sleep apnoea patients
by: Xu, Cuicui, et al.
Published: (2021)

Primary diffuse large B-cell lymphoma of the corpora cavernosa presented as a perineal mass
by: Carlos, González-Satué, et al.
Published: (2012)

Feasibility of pooling annotated corpora for clinical concept extraction
by: Wagholikar, Kavishwar, et al.
Published: (2012)

Exploring the elusive composition of corpora amylacea of human brain
by: Augé, Elisabet, et al.
Published: (2018)

A non-parametric significance test to compare corpora
by: Koplenig, Alexander
Published: (2019)

PyPlutchik: Visualising and comparing emotion-annotated corpora
by: Semeraro, Alfonso, et al.
Published: (2021)

Dynamics of extracellular matrix in ovarian follicles and corpora lutea of mice
by: Irving-Rodgers, Helen F., et al.
Published: (2009)

Corpora amylacea in human hippocampal brain tissue are intracellular bodies that exhibit a homogeneous distribution of neo-epitopes
by: Augé, Elisabet, et al.
Published: (2019)

Comparative analysis of five protein-protein interaction corpora
by: Pyysalo, Sampo, et al.
Published: (2008)

On the Colour and Structure Presented by Corpora Lutea in the Early Stage
by: Paterson, Robert
Published: (1844)
