
Unraveling the palindromic and nonpalindromic motifs of retroviral integration site sequences by statistical mixture models

A weak palindromic nucleotide motif is the hallmark of retroviral integration site alignments. Given that the majority of target sequences are not palindromic, the current model explains the symmetry by an overlap of the nonpalindromic motif present on one of the half-sites of the sequences. Here, w...


Bibliographic Details
Main Authors: Miklík, Dalibor; Grim, Jiří; Elleder, Daniel; Hejnar, Jiří
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547254/
https://www.ncbi.nlm.nih.gov/pubmed/37463751
http://dx.doi.org/10.1101/gr.277694.123
author Miklík, Dalibor
Grim, Jiří
Elleder, Daniel
Hejnar, Jiří
collection PubMed
description A weak palindromic nucleotide motif is the hallmark of retroviral integration site alignments. Given that the majority of target sequences are not palindromic, the current model explains the symmetry by an overlap of the nonpalindromic motif present on one of the half-sites of the sequences. Here, we show that the implementation of multicomponent mixture models allows for different interpretations consistent with the existence of both palindromic and nonpalindromic submotifs in the sets of integration site sequences. We further show that the weak palindromic motifs result from freely combined site-specific submotifs restricted to only a few positions proximal to the site of integration. The submotifs are formed by either palindrome-forming nucleotide preference or nucleotide exclusion. Using the mixture models, we also identify HIV-1-favored palindromic sequences in Alu repeats serving as local hotspots for integration. The application of the novel statistical approach provides deeper insight into the selection of retroviral integration sites and may prove to be a valuable tool in the analysis of any type of DNA motifs.
format Online
Article
Text
id pubmed-10547254
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-10547254 2023-10-04
Genome Res
Methods
Cold Spring Harbor Laboratory Press 2023-08
/pmc/articles/PMC10547254/
/pubmed/37463751
http://dx.doi.org/10.1101/gr.277694.123
Text en
© 2023 Miklík et al.; Published by Cold Spring Harbor Laboratory Press
https://creativecommons.org/licenses/by-nc/4.0/
This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Unraveling the palindromic and nonpalindromic motifs of retroviral integration site sequences by statistical mixture models
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547254/
https://www.ncbi.nlm.nih.gov/pubmed/37463751
http://dx.doi.org/10.1101/gr.277694.123