A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification

Given only the URL of a Web page, can we identify its topic? We study this problem in detail by exploring a large number of different feature sets and algorithms on several datasets. We also show that the inherent overlap between topics and the sparsity of the information in URLs make this a very c...

Bibliographic Details
Main authors: Weber, I, Marian, L, Henzinger, M, Baykan, E
Language: English
Published: 2011
Subjects:
XX
Online access: https://dx.doi.org/10.1145/1993053.1993057
http://cds.cern.ch/record/1399741