Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets
Construction of knowledge repositories from web corpora by harvesting linguistic patterns is of benefit for many natural language-processing applications that rely on question-answering schemes. These methods require minimal or no human intervention and can recursively learn new relational facts-ins...
Main authors: | Moghaddam, Hoora Rezaei; Ramanna, Sheela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318947/ https://www.ncbi.nlm.nih.gov/pubmed/32835308 http://dx.doi.org/10.1016/j.patter.2020.100053 |
_version_ | 1783550962060754944 |
---|---|
author | Moghaddam, Hoora Rezaei; Ramanna, Sheela |
author_facet | Moghaddam, Hoora Rezaei; Ramanna, Sheela |
author_sort | Moghaddam, Hoora Rezaei |
collection | PubMed |
description | Construction of knowledge repositories from web corpora by harvesting linguistic patterns is of benefit for many natural language-processing applications that rely on question-answering schemes. These methods require minimal or no human intervention and can recursively learn new relational facts-instances in a fully automated and scalable manner. This paper explores the performance of tolerance rough set-based learner with respect to two important issues: scalability and its effect on concept drift, by (1) designing a new version of the semi-supervised tolerance rough set-based pattern learner (TPL 2.0), (2) adapting a tolerance form of rough set methodology to categorize linguistic patterns, and (3) extracting categorical information from a large noisy dataset of crawled web pages. This work demonstrates that the TPL 2.0 learner is promising in terms of precision@30 metric when compared with three benchmark algorithms: Tolerant Pattern Learner 1.0, Fuzzy-Rough Set Pattern Learner, and Coupled Bayesian Sets-based learner. |
format | Online Article Text |
id | pubmed-7318947 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-73189472020-06-29 Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets Moghaddam, Hoora Rezaei Ramanna, Sheela Patterns (N Y) Article Construction of knowledge repositories from web corpora by harvesting linguistic patterns is of benefit for many natural language-processing applications that rely on question-answering schemes. These methods require minimal or no human intervention and can recursively learn new relational facts-instances in a fully automated and scalable manner. This paper explores the performance of tolerance rough set-based learner with respect to two important issues: scalability and its effect on concept drift, by (1) designing a new version of the semi-supervised tolerance rough set-based pattern learner (TPL 2.0), (2) adapting a tolerance form of rough set methodology to categorize linguistic patterns, and (3) extracting categorical information from a large noisy dataset of crawled web pages. This work demonstrates that the TPL 2.0 learner is promising in terms of precision@30 metric when compared with three benchmark algorithms: Tolerant Pattern Learner 1.0, Fuzzy-Rough Set Pattern Learner, and Coupled Bayesian Sets-based learner. Elsevier 2020-06-26 /pmc/articles/PMC7318947/ /pubmed/32835308 http://dx.doi.org/10.1016/j.patter.2020.100053 Text en © 2020 The Authors http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Moghaddam, Hoora Rezaei Ramanna, Sheela Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title | Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title_full | Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title_fullStr | Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title_full_unstemmed | Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title_short | Harvesting Patterns from Textual Web Sources with Tolerance Rough Sets |
title_sort | harvesting patterns from textual web sources with tolerance rough sets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318947/ https://www.ncbi.nlm.nih.gov/pubmed/32835308 http://dx.doi.org/10.1016/j.patter.2020.100053 |
work_keys_str_mv | AT moghaddamhoorarezaei harvestingpatternsfromtextualwebsourceswithtoleranceroughsets AT ramannasheela harvestingpatternsfromtextualwebsourceswithtoleranceroughsets |