Periodic pattern detection in sparse boolean sequences

BACKGROUND: The specific position of functionally related genes along the DNA has been shown to reflect the interplay between chromosome structure and genetic regulation. By investigating the statistical properties of the distances separating such genes, several studies have highlighted various peri...

Bibliographic Details
Main authors: Junier, Ivan; Hérisson, Joan; Képès, François
Format: Text
Language: English
Published: BioMed Central 2010
Subjects: Software Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949599/
https://www.ncbi.nlm.nih.gov/pubmed/20831781
http://dx.doi.org/10.1186/1748-7188-5-31
author Junier, Ivan
Hérisson, Joan
Képès, François
collection PubMed
description BACKGROUND: The specific position of functionally related genes along the DNA has been shown to reflect the interplay between chromosome structure and genetic regulation. By investigating the statistical properties of the distances separating such genes, several studies have highlighted various periodic trends. In many cases, however, groups built from co-functional or co-regulated genes are small and contain erroneous information (data contamination), so that the statistics are difficult to exploit. In addition, gene positions are not expected to follow a perfectly ordered pattern along the DNA. Within this scope, we present an algorithm that aims to highlight periodic patterns in sparse boolean sequences, i.e. sequences of the type 010011011010... in which the ratio of the number of 1's (denoting here the transcription start of a gene) to the number of 0's is small. RESULTS: The algorithm is particularly robust with respect to strong signal distortions such as the addition of 1's at arbitrary positions (contaminated data), the deletion of existing 1's in the sequence (missing data) and the presence of disorder in the position of the 1's (noise). This robustness stems from an appropriate exploitation of the remarkable alignment properties of periodic points in solenoidal coordinates. CONCLUSIONS: The efficiency of the algorithm is demonstrated in situations where standard Fourier-based spectral methods are poorly suited. We also show how the proposed framework makes it possible to identify the 1's that participate in the periodic trends, i.e. how it assigns a positional score to genes, in the same spirit as the sequence score. The software is available for public use at http://www.issb.genopole.fr/MEGA/Softwares/iSSB_SolenoidalApplication.zip.
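The alignment idea mentioned in the description can be illustrated with a minimal sketch. This is not the authors' solenoidal scoring, which the record does not specify; it assumes a simpler stand-in from circular statistics (the mean resultant length of the positions wrapped onto a circle whose circumference is the trial period). The function names `alignment_score`, `scan_periods` and `best_period` are hypothetical.

```python
import cmath
import math


def alignment_score(positions, period):
    """Mean resultant length of positions wrapped onto a circle of
    circumference `period` (assumed stand-in score, not the paper's).
    Close to 1 when the 1's share a common phase, small otherwise."""
    if not positions:
        return 0.0
    total = sum(
        cmath.exp(2j * math.pi * (x % period) / period) for x in positions
    )
    return abs(total) / len(positions)


def scan_periods(sequence, periods):
    """Score every trial period for a 0/1 sequence given as a list of ints."""
    positions = [i for i, bit in enumerate(sequence) if bit == 1]
    return {p: alignment_score(positions, p) for p in periods}


def best_period(sequence, periods):
    """Return the trial period with the highest alignment score."""
    scores = scan_periods(sequence, periods)
    return max(scores, key=scores.get)


# Sparse sequence: a 1 every 50 positions, plus three contaminating 1's.
sequence = [0] * 1000
for i in range(0, 1000, 50):
    sequence[i] = 1
for i in (7, 333, 612):
    sequence[i] = 1

print(best_period(sequence, range(30, 80)))
```

Divisors of the true period align the points equally well under this score, so the trial range in the example is chosen to exclude them; a practical scan would need to handle such harmonics explicitly.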
format Text
id pubmed-2949599
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2949599 2010-11-03 Periodic pattern detection in sparse boolean sequences Junier, Ivan Hérisson, Joan Képès, François Algorithms Mol Biol Software Article
BioMed Central 2010-09-10 /pmc/articles/PMC2949599/ /pubmed/20831781 http://dx.doi.org/10.1186/1748-7188-5-31 Text en Copyright ©2010 Junier et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Periodic pattern detection in sparse boolean sequences
topic Software Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949599/
https://www.ncbi.nlm.nih.gov/pubmed/20831781
http://dx.doi.org/10.1186/1748-7188-5-31