Unsupervised acquisition of idiomatic units of symbolic natural language: An n-gram frequency-based approach for the chunking of news articles and tweets

Symbolic sequential data are produced in huge quantities in numerous contexts, such as text and speech data, biometrics, genomics, financial market indexes, music sheets, and online social media posts. In this paper, an unsupervised approach for the chunking of idiomatic units of sequential text dat...

Bibliographic Details
Main Authors: Borrelli, Dario, Gongora Svartzman, Gabriela, Lipizzi, Carlo
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7279599/
https://www.ncbi.nlm.nih.gov/pubmed/32511252
http://dx.doi.org/10.1371/journal.pone.0234214