
DNA sequence features underlying large-scale duplications and deletions in human

Copy number variants (CNVs) may cover up to 12% of the whole genome and have substantial impact on phenotypes. We used 5867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characte...


Bibliographic Details
Main authors: Kołomański, Mateusz, Szyda, Joanna, Frąszczak, Magdalena, Mielczarek, Magda
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365719/
https://www.ncbi.nlm.nih.gov/pubmed/35590085
http://dx.doi.org/10.1007/s13353-022-00704-0
_version_ 1784765403192360960
author Kołomański, Mateusz
Szyda, Joanna
Frąszczak, Magdalena
Mielczarek, Magda
author_facet Kołomański, Mateusz
Szyda, Joanna
Frąszczak, Magdalena
Mielczarek, Magda
author_sort Kołomański, Mateusz
collection PubMed
description Copy number variants (CNVs) may cover up to 12% of the whole genome and have substantial impact on phenotypes. We used 5867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characteristic of those regions. The GC content for deletions was lower and for duplications was higher than for randomly selected regions. In regions flanking deletions and downstream of duplications, GC content was higher than in the random sequences, but upstream of duplications it was lower. In duplications and downstream of deletion regions, the percentage of low-complexity sequences was not different from the randomised data. In deletions and upstream of CNVs it was higher, while downstream of duplications it was lower than in random sequences. The majority of CNVs intersected with genic regions, mainly with introns. GC content may be associated with CNV formation, and CNVs, especially duplications, are initiated in low-complexity regions. Moreover, the location of CNVs in, or their overlap with, introns indicates their role in shaping intron variability. Genic CNV regions were enriched in many essential biological processes such as cell adhesion, synaptic transmission, transport, cytoskeleton organization, immune response and metabolic mechanisms, which indicates that these large-scale variants play important biological roles. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13353-022-00704-0.
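The abstract compares GC content inside CNV regions with that of randomly selected regions. The sketch below illustrates one way such a comparison could be set up; it is not the authors' pipeline. The function names (`gc_content`, `random_windows`, `gc_vs_random`), the 0-based half-open coordinates, the choice to ignore ambiguous bases such as N, and the toy sequence are all assumptions made for illustration.

```python
import random


def gc_content(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def random_windows(genome: str, length: int, n: int, rng: random.Random) -> list[str]:
    """Draw n random windows of the given length from the genome string."""
    if length <= 0 or length > len(genome):
        raise ValueError("window length must be between 1 and len(genome)")
    starts = [rng.randrange(len(genome) - length + 1) for _ in range(n)]
    return [genome[s:s + length] for s in starts]


def gc_vs_random(genome: str, start: int, end: int, n: int = 1000, seed: int = 0):
    """Return (GC of region [start, end), mean GC of n random same-length windows)."""
    rng = random.Random(seed)  # fixed seed keeps the background reproducible
    observed = gc_content(genome[start:end])
    background = [gc_content(w) for w in random_windows(genome, end - start, n, rng)]
    return observed, sum(background) / len(background)


if __name__ == "__main__":
    # Toy sequence (hypothetical): AT-rich background with a GC-only stretch.
    toy = "AT" * 50 + "GC" * 10 + "AT" * 50
    obs, bg = gc_vs_random(toy, 100, 120)
    print(f"region GC = {obs:.2f}, mean random GC = {bg:.2f}")
```

A real analysis would read CNV coordinates and a reference genome from files and would apply a statistical test to the two distributions rather than comparing a single region against a mean.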
format Online
Article
Text
id pubmed-9365719
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-9365719 2022-08-12 DNA sequence features underlying large-scale duplications and deletions in human Kołomański, Mateusz Szyda, Joanna Frąszczak, Magdalena Mielczarek, Magda J Appl Genet Human Genetics • Original Paper Copy number variants (CNVs) may cover up to 12% of the whole genome and have substantial impact on phenotypes. We used 5867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characteristic of those regions. The GC content for deletions was lower and for duplications was higher than for randomly selected regions. In regions flanking deletions and downstream of duplications, GC content was higher than in the random sequences, but upstream of duplications it was lower. In duplications and downstream of deletion regions, the percentage of low-complexity sequences was not different from the randomised data. In deletions and upstream of CNVs it was higher, while downstream of duplications it was lower than in random sequences. The majority of CNVs intersected with genic regions, mainly with introns. GC content may be associated with CNV formation, and CNVs, especially duplications, are initiated in low-complexity regions. Moreover, the location of CNVs in, or their overlap with, introns indicates their role in shaping intron variability. Genic CNV regions were enriched in many essential biological processes such as cell adhesion, synaptic transmission, transport, cytoskeleton organization, immune response and metabolic mechanisms, which indicates that these large-scale variants play important biological roles. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13353-022-00704-0.
Springer Berlin Heidelberg 2022-05-20 2022 /pmc/articles/PMC9365719/ /pubmed/35590085 http://dx.doi.org/10.1007/s13353-022-00704-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Human Genetics • Original Paper
Kołomański, Mateusz
Szyda, Joanna
Frąszczak, Magdalena
Mielczarek, Magda
DNA sequence features underlying large-scale duplications and deletions in human
title DNA sequence features underlying large-scale duplications and deletions in human
title_full DNA sequence features underlying large-scale duplications and deletions in human
title_fullStr DNA sequence features underlying large-scale duplications and deletions in human
title_full_unstemmed DNA sequence features underlying large-scale duplications and deletions in human
title_short DNA sequence features underlying large-scale duplications and deletions in human
title_sort dna sequence features underlying large-scale duplications and deletions in human
topic Human Genetics • Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365719/
https://www.ncbi.nlm.nih.gov/pubmed/35590085
http://dx.doi.org/10.1007/s13353-022-00704-0
work_keys_str_mv AT kołomanskimateusz dnasequencefeaturesunderlyinglargescaleduplicationsanddeletionsinhuman
AT szydajoanna dnasequencefeaturesunderlyinglargescaleduplicationsanddeletionsinhuman
AT fraszczakmagdalena dnasequencefeaturesunderlyinglargescaleduplicationsanddeletionsinhuman
AT mielczarekmagda dnasequencefeaturesunderlyinglargescaleduplicationsanddeletionsinhuman