Cargando…
Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples
The presence of a schema for XML documents has numerous advantages. Unfortunately, many XML documents in practice are not accompanied by a schema or a valid schema. Therefore, it is essential to devise algorithms to infer schemas. The fundamental task in XML schema inference is to learn regular expr...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206278/ http://dx.doi.org/10.1007/978-3-030-47436-2_58 |
_version_ | 1783530384337666048 |
---|---|
author | Li, Yeting Chen, Haiming Zhang, Lingqi Huang, Bo Zhang, Jianzhao |
author_facet | Li, Yeting Chen, Haiming Zhang, Lingqi Huang, Bo Zhang, Jianzhao |
author_sort | Li, Yeting |
collection | PubMed |
description | The presence of a schema for XML documents has numerous advantages. Unfortunately, many XML documents in practice are not accompanied by a schema or a valid schema. Therefore, it is essential to devise algorithms to infer schemas. The fundamental task in XML schema inference is to learn regular expressions. In this paper, we focus on learning the subclass of RE(&) called SIREs (the subclass of regular expressions with interleaving). Previous work in this direction lacks inference algorithms that support inference from positive and negative examples. We provide an algorithm to learn SIREs from positive and negative examples based on genetic algorithms and parallel techniques. Our algorithm also has better expansibility, which means that our algorithm not only supports learning with positive and negative examples, but also supports learning with positive or negative examples only. Experimental results demonstrate the effectiveness of our algorithm. |
format | Online Article Text |
id | pubmed-7206278 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72062782020-05-08 Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples Li, Yeting Chen, Haiming Zhang, Lingqi Huang, Bo Zhang, Jianzhao Advances in Knowledge Discovery and Data Mining Article The presence of a schema for XML documents has numerous advantages. Unfortunately, many XML documents in practice are not accompanied by a schema or a valid schema. Therefore, it is essential to devise algorithms to infer schemas. The fundamental task in XML schema inference is to learn regular expressions. In this paper, we focus on learning the subclass of RE(&) called SIREs (the subclass of regular expressions with interleaving). Previous work in this direction lacks inference algorithms that support inference from positive and negative examples. We provide an algorithm to learn SIREs from positive and negative examples based on genetic algorithms and parallel techniques. Our algorithm also has better expansibility, which means that our algorithm not only supports learning with positive and negative examples, but also supports learning with positive or negative examples only. Experimental results demonstrate the effectiveness of our algorithm. 2020-04-17 /pmc/articles/PMC7206278/ http://dx.doi.org/10.1007/978-3-030-47436-2_58 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Li, Yeting Chen, Haiming Zhang, Lingqi Huang, Bo Zhang, Jianzhao Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title | Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title_full | Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title_fullStr | Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title_full_unstemmed | Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title_short | Inferring Restricted Regular Expressions with Interleaving from Positive and Negative Samples |
title_sort | inferring restricted regular expressions with interleaving from positive and negative samples |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206278/ http://dx.doi.org/10.1007/978-3-030-47436-2_58 |
work_keys_str_mv | AT liyeting inferringrestrictedregularexpressionswithinterleavingfrompositiveandnegativesamples AT chenhaiming inferringrestrictedregularexpressionswithinterleavingfrompositiveandnegativesamples AT zhanglingqi inferringrestrictedregularexpressionswithinterleavingfrompositiveandnegativesamples AT huangbo inferringrestrictedregularexpressionswithinterleavingfrompositiveandnegativesamples AT zhangjianzhao inferringrestrictedregularexpressionswithinterleavingfrompositiveandnegativesamples |