
Dataset of bulged G-quadruplex forming sequences in the human genome

When several continuous guanine runs lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genot...


Bibliographic Details
Main Authors: Papp, Csaba, Jenjaroenpun, Piroon, Mukundan, Vineeth T., Phan, Anh Tuân, Kuznetsov, Vladimir A.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515301/
https://www.ncbi.nlm.nih.gov/pubmed/37743888
http://dx.doi.org/10.1016/j.dib.2023.109550
_version_ 1785108920715444224
author Papp, Csaba
Jenjaroenpun, Piroon
Mukundan, Vineeth T.
Phan, Anh Tuân
Kuznetsov, Vladimir A.
author_facet Papp, Csaba
Jenjaroenpun, Piroon
Mukundan, Vineeth T.
Phan, Anh Tuân
Kuznetsov, Vladimir A.
author_sort Papp, Csaba
collection PubMed
description When several continuous guanine runs lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genotoxic stress. Several types of G4-forming DNA sequences exist, including bulged G4-forming sequences (G4-BS). Such bulges arise when non-guanine bases are present at specific locations within the G-runs of the G4-forming sequences. At present, search algorithms do not identify stable G4-BS conformations, making genome-wide studies of G4-like structures difficult. The data provided in this study relate to the article "Stable bulged G-quadruplexes in the human genome: Identification, experimental validation and functionalization" published in Nucleic Acids Research [doi.org/10.1093/nar/gkad252]. Based on our in vitro studies and our analysis of G4-seq and G4 CUT&Tag data, we have specified and validated three pG4-BS models. In this article, we present a large 'raw' (unfiltered) dataset that includes three subfamilies of pG4-BS. For each pG4-BS, we provide strand-specific genomic boundaries. Data on pG4-BS might be useful in elucidating their structural, functional, and evolutionary roles. Furthermore, they may provide insight into the pathobiology of G4-like structures and their potential therapeutic applications.
format Online
Article
Text
id pubmed-10515301
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10515301 2023-09-23 Dataset of bulged G-quadruplex forming sequences in the human genome Papp, Csaba Jenjaroenpun, Piroon Mukundan, Vineeth T. Phan, Anh Tuân Kuznetsov, Vladimir A. Data Brief Data Article When several continuous guanine runs lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genotoxic stress. Several types of G4-forming DNA sequences exist, including bulged G4-forming sequences (G4-BS). Such bulges arise when non-guanine bases are present at specific locations within the G-runs of the G4-forming sequences. At present, search algorithms do not identify stable G4-BS conformations, making genome-wide studies of G4-like structures difficult. The data provided in this study relate to the article "Stable bulged G-quadruplexes in the human genome: Identification, experimental validation and functionalization" published in Nucleic Acids Research [doi.org/10.1093/nar/gkad252]. Based on our in vitro studies and our analysis of G4-seq and G4 CUT&Tag data, we have specified and validated three pG4-BS models. In this article, we present a large 'raw' (unfiltered) dataset that includes three subfamilies of pG4-BS. For each pG4-BS, we provide strand-specific genomic boundaries. Data on pG4-BS might be useful in elucidating their structural, functional, and evolutionary roles. Furthermore, they may provide insight into the pathobiology of G4-like structures and their potential therapeutic applications. Elsevier 2023-09-06 /pmc/articles/PMC10515301/ /pubmed/37743888 http://dx.doi.org/10.1016/j.dib.2023.109550 Text en Published by Elsevier Inc. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Data Article
Papp, Csaba
Jenjaroenpun, Piroon
Mukundan, Vineeth T.
Phan, Anh Tuân
Kuznetsov, Vladimir A.
Dataset of bulged G-quadruplex forming sequences in the human genome
title Dataset of bulged G-quadruplex forming sequences in the human genome
title_sort dataset of bulged g-quadruplex forming sequences in the human genome
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515301/
https://www.ncbi.nlm.nih.gov/pubmed/37743888
http://dx.doi.org/10.1016/j.dib.2023.109550