Simple Semantics in Topic Detection and Tracking

Bibliographic Details
Main Authors: Makkonen, Juha, Ahonen-Myka, Helena, Salmenkivi, Marko
Format: Online Article Text
Language: English
Published: Kluwer Academic Publishers 2004
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7088333/
https://www.ncbi.nlm.nih.gov/pubmed/32214876
http://dx.doi.org/10.1023/B:INRT.0000011210.12953.86
Description
Summary: Topic Detection and Tracking (TDT) is a research initiative that aims to develop techniques for organizing news documents in terms of news events. We propose a method that incorporates simple semantics into TDT by splitting the term space into groups of terms whose meanings are of the same type. Such a group can be associated with an external ontology, which is used to determine the similarity of two terms in that group. We extract proper names, locations, temporal expressions and normal terms into distinct sub-vectors of the document representation. The similarity of two documents is measured by comparing their corresponding sub-vectors one pair at a time. We use a simple perceptron to optimize the relative emphasis of each semantic class in the tracking and detection decisions. The results suggest that the spatial and temporal similarity measures need to be improved. In particular, the vagueness of spatial and temporal terms needs to be addressed.
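
The class-wise comparison described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each document is a dictionary of four sparse sub-vectors, substitutes plain cosine similarity for the ontology-based term similarity the summary mentions, and uses a textbook perceptron update. The names `CLASSES`, `class_similarities`, `train_perceptron` and `same_topic` are hypothetical.

```python
import math

# Assumed labels for the four semantic classes named in the summary.
CLASSES = ("names", "locations", "temporals", "terms")


def cosine(u, v):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def class_similarities(doc_a, doc_b):
    """Compare corresponding sub-vectors, one semantic class at a time."""
    # Assumption: cosine stands in for any ontology-based similarity measure.
    return [cosine(doc_a.get(c, {}), doc_b.get(c, {})) for c in CLASSES]


def train_perceptron(examples, epochs=20, lr=0.1):
    """Learn one weight per semantic class from (similarities, label) pairs.

    Labels are +1 (same topic) or -1 (different topic).
    """
    weights = [0.0] * len(CLASSES)
    bias = 0.0
    for _ in range(epochs):
        for sims, label in examples:
            score = sum(w * s for w, s in zip(weights, sims)) + bias
            if label * score <= 0:  # misclassified or on the boundary
                weights = [w + lr * label * s for w, s in zip(weights, sims)]
                bias += lr * label
    return weights, bias


def same_topic(doc_a, doc_b, weights, bias):
    """Decide whether two documents discuss the same event."""
    sims = class_similarities(doc_a, doc_b)
    return sum(w * s for w, s in zip(weights, sims)) + bias > 0
```

In this sketch the learned weights play the role of the "relative emphasis" of each semantic class: a class whose similarity separates the training pairs well ends up with a larger weight in the final decision.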