Cargando…

Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words

The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it ca...

Descripción completa

Detalles Bibliográficos
Autores principales: Bonizzoni, Paola, De Felice, Clelia, Zaccagnino, Rocco, Zizza, Rosalba
Formato: Online Artículo Texto
Lenguaje:English
Publicado: 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206627/
http://dx.doi.org/10.1007/978-3-030-40608-0_27
_version_ 1783530446209941504
author Bonizzoni, Paola
De Felice, Clelia
Zaccagnino, Rocco
Zizza, Rosalba
author_facet Bonizzoni, Paola
De Felice, Clelia
Zaccagnino, Rocco
Zizza, Rosalba
author_sort Bonizzoni, Paola
collection PubMed
description The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it can be computed in linear time and it is uniquely determined. In this paper we investigate new properties of this factorization with the purpose of exploring its use in string queries. As a main result, we prove an upper bound on the length of the longest common extension (or longest common prefix) for two factors of a word w. This bound is at most the maximum length of two consecutive factors of [Formula: see text]. A tool used in the proof is a property that we state for factors with nonempty borders in [Formula: see text]: a nonempty border of a factor [Formula: see text] cannot be a prefix of the next factor [Formula: see text]. Another interesting result relates sorting of global suffixes, i.e., suffixes of a word w, and sorting of local suffixes, i.e., suffixes of the factors in [Formula: see text]. Finally, given a word w and a factor x of w, we prove that their Lyndon factorizations share factors, except for the first and last term of the Lyndon factorization of x. This property suggests that, given two words sharing a common overlap, their Lyndon factorizations could be used to capture the common overlap of these two words.
format Online
Article
Text
id pubmed-7206627
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-72066272020-05-08 Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words Bonizzoni, Paola De Felice, Clelia Zaccagnino, Rocco Zizza, Rosalba Language and Automata Theory and Applications Article The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization [Formula: see text], introduced in [5], maintains the main properties of the Lyndon factorization since it can be computed in linear time and it is uniquely determined. In this paper we investigate new properties of this factorization with the purpose of exploring its use in string queries. As a main result, we prove an upper bound on the length of the longest common extension (or longest common prefix) for two factors of a word w. This bound is at most the maximum length of two consecutive factors of [Formula: see text]. A tool used in the proof is a property that we state for factors with nonempty borders in [Formula: see text]: a nonempty border of a factor [Formula: see text] cannot be a prefix of the next factor [Formula: see text]. Another interesting result relates sorting of global suffixes, i.e., suffixes of a word w, and sorting of local suffixes, i.e., suffixes of the factors in [Formula: see text]. Finally, given a word w and a factor x of w, we prove that their Lyndon factorizations share factors, except for the first and last term of the Lyndon factorization of x. This property suggests that, given two words sharing a common overlap, their Lyndon factorizations could be used to capture the common overlap of these two words. 2020-01-07 /pmc/articles/PMC7206627/ http://dx.doi.org/10.1007/978-3-030-40608-0_27 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Bonizzoni, Paola
De Felice, Clelia
Zaccagnino, Rocco
Zizza, Rosalba
Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title_full Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title_fullStr Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title_full_unstemmed Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title_short Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words
title_sort lyndon words versus inverse lyndon words: queries on suffixes and bordered words
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206627/
http://dx.doi.org/10.1007/978-3-030-40608-0_27
work_keys_str_mv AT bonizzonipaola lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords
AT defeliceclelia lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords
AT zaccagninorocco lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords
AT zizzarosalba lyndonwordsversusinverselyndonwordsqueriesonsuffixesandborderedwords