subs2vec: Word embeddings from subtitles in 55 languages

This paper introduces a novel collection of word embeddings, numerical representations of lexical semantics, in 55 languages, trained on a large corpus of pseudo-conversational speech transcriptions from television shows and movies. The embeddings were trained on the OpenSubtitles corpus using the f...

Bibliographic Details
Main Authors:	van Paridon, Jeroen; Thompson, Bill
Format:	Online Article Text
Language:	English
Published:	Springer US 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062394/ https://www.ncbi.nlm.nih.gov/pubmed/32789660 http://dx.doi.org/10.3758/s13428-020-01406-3
