
Complexity of a small non-protein coding sequence in chromosomal region 22q11.2: presence of specialized DNA secondary structures and RNA exon/intron motifs

BACKGROUND: DiGeorge Syndrome is a genetic abnormality involving a ~3 Mb deletion in human chromosome 22, termed 22q11.2. To better understand the non-coding regions of 22q11.2, a small 10,000 bp non-protein-coding sequence close to the DiGeorge Critical Region 6 gene (DGCR6) was chosen for analysis...


Bibliographic Details
Main Author: Delihas, Nicholas
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4607176/
https://www.ncbi.nlm.nih.gov/pubmed/26467088
http://dx.doi.org/10.1186/s12864-015-1958-6
author Delihas, Nicholas
collection PubMed
description BACKGROUND: DiGeorge Syndrome is a genetic abnormality involving a ~3 Mb deletion in human chromosome 22, termed 22q11.2. To better understand the non-coding regions of 22q11.2, a small 10,000 bp non-protein-coding sequence close to the DiGeorge Critical Region 6 gene (DGCR6) was chosen for analysis of its sequences and functional entities, as the homologous sequence in the chimpanzee genome could be aligned and used for comparisons. METHODS: The GenBank database provided genomic sequences. In silico computer programs were used to find homologous DNA sequences in the human and chimpanzee genomes, generate random sequences, determine DNA sequence alignments, sequence comparisons and nucleotide repeat copies, and predict DNA secondary structures. RESULTS: In its 5′ half, the 10,000 bp sequence has three distinct sections that represent phylogenetically variable sequences. These Variable Regions contain biased mutations with a very high A + T content, multiple copies of the motif TATAATATA, and sequences that fold into long A:T-base-paired stem loops. The 3′ half of the 10,000 bp unit, highly conserved between human and chimpanzee, has sequences representing exons of lncRNA genes and segments of introns of protein genes. Central to the 10,000 bp unit are multiple copies of a sequence that originates from the flanking 5′ end of the translocation breakpoint Type A sequence. This breakpoint flanking sequence carries the exon and intron motifs. The breakpoint Type A sequence seems to be a major player in the proliferation of these RNA motifs, as well as in the proliferation of Variable Regions in the 10,000 bp segment and other regions within 22q11.2. CONCLUSIONS: The data indicate that a non-coding region of the chromosome may be reserved for highly biased mutations that lead to the formation of specialized sequences and DNA secondary structures. On the other hand, the highly conserved nucleotide sequence of the non-coding region may form storage sites for RNA motifs.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1958-6) contains supplementary material, which is available to authorized users.
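The abstract above refers to two simple sequence measurements: A + T content and the number of copies of the motif TATAATATA. The following is a minimal sketch of how such quantities could be computed. It is not the software used in the article; the function names and the toy sequence are hypothetical, and counting overlapping motif copies is an assumption made here for illustration.

```python
def at_content(seq: str) -> float:
    """Return the fraction of bases in seq that are A or T (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def count_motif(seq: str, motif: str = "TATAATATA") -> int:
    """Count occurrences of motif in seq, allowing overlapping copies.

    Overlap handling is an illustrative assumption; the article does not
    state how repeat copies were tallied.
    """
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one base so that overlapping copies are also found.
        start = seq.find(motif, start + 1)
    return count


# Hypothetical toy sequence, not taken from the 22q11.2 region.
toy = "GCTATAATATAATATACG"
print(f"A+T fraction: {at_content(toy):.3f}")
print(f"TATAATATA copies: {count_motif(toy)}")
```

On a real GenBank sequence, the same functions could be applied to sliding windows to locate A + T-rich stretches, though window size and thresholds would need to be chosen by the analyst.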
format Online
Article
Text
id pubmed-4607176
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4607176 2015-10-16 Complexity of a small non-protein coding sequence in chromosomal region 22q11.2: presence of specialized DNA secondary structures and RNA exon/intron motifs Delihas, Nicholas BMC Genomics Research Article BioMed Central 2015-10-14 /pmc/articles/PMC4607176/ /pubmed/26467088 http://dx.doi.org/10.1186/s12864-015-1958-6 Text en © Delihas. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Complexity of a small non-protein coding sequence in chromosomal region 22q11.2: presence of specialized DNA secondary structures and RNA exon/intron motifs
topic Research Article