The construction of Chinese microblog gender-specific thesauruses and user gender classification

Bibliographic Details
Main Authors: Zhu, Zhiliang; Ke, Zejun; Cui, Jiayin; Yu, Hai; Liu, Guoqi
Format: Online Article Text
Language: English
Published: Springer International Publishing 2018
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6223889/
https://www.ncbi.nlm.nih.gov/pubmed/30465023
http://dx.doi.org/10.1007/s41109-018-0104-1
Description
Summary: Statistical features show that short text messages published by users of different genders differ in the words and semantics used. In this paper, two new features are constructed after building a gender-specific thesaurus. A new classification model is built by combining the traditional statistical features with the improved text implicitness feature. An experimental evaluation on the Sina Weibo dataset demonstrated the effectiveness of the gender-specific thesaurus-based features, and the improved text implicitness feature raised the accuracy of gender classification to 84.7%.
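
The summary does not say how the thesaurus or its features are computed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes posts are already tokenized (Chinese text would first need word segmentation), that a word belongs to one gender's thesaurus when its smoothed relative frequency is at least `min_ratio` times higher for that gender, and that the derived features are simply the share of a post's tokens found in each thesaurus. The names `build_thesaurus`, `thesaurus_features`, `min_ratio`, and `smoothing` are hypothetical.

```python
from collections import Counter


def build_thesaurus(posts, min_ratio=2.0, smoothing=1.0):
    """Build a toy gender-specific word list (hypothetical criterion).

    posts: iterable of (gender, tokens) pairs, gender being "f" or "m".
    A word is assigned to a gender when its add-one-smoothed relative
    frequency for that gender is at least min_ratio times the other's.
    """
    counts = {"f": Counter(), "m": Counter()}
    for gender, tokens in posts:
        counts[gender].update(tokens)

    totals = {g: sum(c.values()) for g, c in counts.items()}
    vocab = set(counts["f"]) | set(counts["m"])
    thesaurus = {"f": set(), "m": set()}

    for word in vocab:
        # Smoothed relative frequency of the word for each gender.
        rate_f = (counts["f"][word] + smoothing) / (totals["f"] + smoothing * len(vocab))
        rate_m = (counts["m"][word] + smoothing) / (totals["m"] + smoothing * len(vocab))
        if rate_f / rate_m >= min_ratio:
            thesaurus["f"].add(word)
        elif rate_m / rate_f >= min_ratio:
            thesaurus["m"].add(word)
    return thesaurus


def thesaurus_features(tokens, thesaurus):
    """Return (female_share, male_share): the fraction of tokens in each list."""
    n = len(tokens) or 1  # avoid division by zero on empty posts
    female_share = sum(t in thesaurus["f"] for t in tokens) / n
    male_share = sum(t in thesaurus["m"] for t in tokens) / n
    return female_share, male_share
```

In a setup like this, the two shares would be appended to the usual statistical features before training a classifier; the choice of classifier and of the other features is left open here because the record does not specify them.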