subs2vec: Word embeddings from subtitles in 55 languages

This paper introduces a novel collection of word embeddings, numerical representations of lexical semantics, in 55 languages, trained on a large corpus of pseudo-conversational speech transcriptions from television shows and movies. The embeddings were trained on the OpenSubtitles corpus using the f...

Bibliographic Details
Main authors: van Paridon, Jeroen; Thompson, Bill
Format: Online Article Text
Language: English
Published: Springer US 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062394/
https://www.ncbi.nlm.nih.gov/pubmed/32789660
http://dx.doi.org/10.3758/s13428-020-01406-3