Cargando…

Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets

Construction of knowledge repositories from web corpora by harvesting linguistic patterns is of benefit for many natural language-processing applications that rely on question-answering schemes. These methods require minimal or no human intervention and can recursively learn new relational facts-ins...

Descripción completa

Detalles Bibliográficos
Autores principales:	Moghaddam, Hoora Rezaei, Ramanna, Sheela
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Elsevier 2020
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318947/ https://www.ncbi.nlm.nih.gov/pubmed/32835308 http://dx.doi.org/10.1016/j.patter.2020.100053

Descripción
Sumario:	Construction of knowledge repositories from web corpora by harvesting linguistic patterns is of benefit for many natural language-processing applications that rely on question-answering schemes. These methods require minimal or no human intervention and can recursively learn new relational facts-instances in a fully automated and scalable manner. This paper explores the performance of tolerance rough set-based learner with respect to two important issues: scalability and its effect on concept drift, by (1) designing a new version of the semi-supervised tolerance rough set-based pattern learner (TPL 2.0), (2) adapting a tolerance form of rough set methodology to categorize linguistic patterns, and (3) extracting categorical information from a large noisy dataset of crawled web pages. This work demonstrates that the TPL 2.0 learner is promising in terms of precision@30 metric when compared with three benchmark algorithms: Tolerant Pattern Learner 1.0, Fuzzy-Rough Set Pattern Learner, and Coupled Bayesian Sets-based learner.

Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets

Ejemplares similares