Cargando…
Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it ca...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206627/ http://dx.doi.org/10.1007/978-3-030-40608-0_27 |
_version_ | 1783530446209941504 |
---|---|
author | Bonizzoni, Paola De Felice, Clelia Zaccagnino, Rocco Zizza, Rosalba |
author_facet | Bonizzoni, Paola De Felice, Clelia Zaccagnino, Rocco Zizza, Rosalba |
author_sort | Bonizzoni, Paola |
collection | PubMed |
description | The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it can be computed in linear time and it is uniquely determined. In this paper we investigate new properties of this factorization with the purpose of exploring its use in string queries. As a main result, we prove an upper bound on the length of the longest common extension (or longest common prefix) for two factors of a word w. This bound is at most the maximum length of two consecutive factors of [Formula: see text]. A tool used in the proof is a property that we state for factors with nonempty borders in [Formula: see text]: a nonempty border of a factor [Formula: see text] cannot be a prefix of the next factor [Formula: see text]. Another interesting result relates sorting of global suffixes, i.e., suffixes of a word w, and sorting of local suffixes, i.e., suffixes of the factors in [Formula: see text]. Finally, given a word w and a factor x of w, we prove that their Lyndon factorizations share factors, except for the first and last term of the Lyndon factorization of x. This property suggests that, given two words sharing a common overlap, their Lyndon factorizations could be used to capture the common overlap of these two words. |
format | Online Article Text |
id | pubmed-7206627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72066272020-05-08 Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words Bonizzoni, Paola De Felice, Clelia Zaccagnino, Rocco Zizza, Rosalba Language and Automata Theory and Applications Article The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it can be computed in linear time and it is uniquely determined. In this paper we investigate new properties of this factorization with the purpose of exploring its use in string queries. As a main result, we prove an upper bound on the length of the longest common extension (or longest common prefix) for two factors of a word w. This bound is at most the maximum length of two consecutive factors of [Formula: see text]. A tool used in the proof is a property that we state for factors with nonempty borders in [Formula: see text]: a nonempty border of a factor [Formula: see text] cannot be a prefix of the next factor [Formula: see text]. Another interesting result relates sorting of global suffixes, i.e., suffixes of a word w, and sorting of local suffixes, i.e., suffixes of the factors in [Formula: see text]. Finally, given a word w and a factor x of w, we prove that their Lyndon factorizations share factors, except for the first and last term of the Lyndon factorization of x. This property suggests that, given two words sharing a common overlap, their Lyndon factorizations could be used to capture the common overlap of these two words. 2020-01-07 /pmc/articles/PMC7206627/ http://dx.doi.org/10.1007/978-3-030-40608-0_27 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Bonizzoni, Paola De Felice, Clelia Zaccagnino, Rocco Zizza, Rosalba Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title | Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title_full | Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title_fullStr | Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title_full_unstemmed | Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title_short | Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words |
title_sort | lyndon words versus inverse lyndon words: queries on suffixes and bordered words |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206627/ http://dx.doi.org/10.1007/978-3-030-40608-0_27 |
work_keys_str_mv | AT bonizzonipaola lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords AT defeliceclelia lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords AT zaccagninorocco lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords AT zizzarosalba lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords |