
Intronic CNVs and gene expression variation in human populations


Bibliographic Details
Main Authors: Rigau, Maria; Juan, David; Valencia, Alfonso; Rico, Daniel
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345438/
https://www.ncbi.nlm.nih.gov/pubmed/30677042
http://dx.doi.org/10.1371/journal.pgen.1007902
author Rigau, Maria
Juan, David
Valencia, Alfonso
Rico, Daniel
collection PubMed
description Introns can be extraordinarily large and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implication. By combining the most extensive maps of CNVs in human populations, we have found that intronic losses are the most frequent copy number variants (CNVs) in protein-coding genes in humans, with 12,986 intronic deletions, affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes). This intronic length variation results in dozens of genes showing extreme population variability in size, with 40 genes with 10 or more different sizes and up to 150 allelic sizes. Intronic losses are frequent in evolutionarily ancient genes that are highly conserved at the protein sequence level. This result contrasts with losses overlapping exons, which are observed less often than expected by chance and almost exclusively affect primate-specific genes. An integrated analysis of CNVs and RNA-seq data showed that intronic loss can be associated with significant differences in gene expression levels in the population (CNV-eQTLs). These intronic CNV-eQTL regions are enriched for intronic enhancers and can be associated with expression differences of other genes showing long-distance intron-promoter 3D interactions. Our data suggest that intronic structural variation of protein-coding genes makes an important contribution to the variability of gene expression and splicing in human populations.
format Online
Article
Text
id pubmed-6345438
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6345438 2019-02-02 Intronic CNVs and gene expression variation in human populations Rigau, Maria Juan, David Valencia, Alfonso Rico, Daniel PLoS Genet Research Article
Public Library of Science 2019-01-24 /pmc/articles/PMC6345438/ /pubmed/30677042 http://dx.doi.org/10.1371/journal.pgen.1007902 Text en © 2019 Rigau et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Intronic CNVs and gene expression variation in human populations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345438/
https://www.ncbi.nlm.nih.gov/pubmed/30677042
http://dx.doi.org/10.1371/journal.pgen.1007902