
Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain


Bibliographic Details
Main Authors: Linthorst, Jasper; Meert, Wim; Hestand, Matthew S.; Korlach, Jonas; Vermeesch, Joris Robert; Reinders, Marcel J. T.; Holstege, Henne
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7608644/
https://www.ncbi.nlm.nih.gov/pubmed/33139705
http://dx.doi.org/10.1038/s41398-020-01060-5
author Linthorst, Jasper
Meert, Wim
Hestand, Matthew S.
Korlach, Jonas
Vermeesch, Joris Robert
Reinders, Marcel J. T.
Holstege, Henne
collection PubMed
description The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year-old Dutch cognitively healthy woman. We combined this assembly with two previously published haploid assemblies (CHM1 and CHM13) and the GRCh38 reference genome to create a compendium of SVs that occur across five independent human haplotypes using the graph-based multi-genome aligner REVEAL. Across these five haplotypes, we detected 31,680 euchromatic SVs (>50 bp). Of these, ~62% were comprised of repetitive sequences with ‘variable number tandem repeats’ (VNTRs), ~10% were mobile elements (Alu, L1, and SVA), while the remaining variants were inversions and indels. We observed that VNTRs with GC-content >60% and repeat patterns longer than 15 bp were 21-fold enriched in the subtelomeric regions (within 5 Mb of the ends of chromosome arms). VNTR lengths can expand to exceed a critical length which is associated with impaired gene transcription. The genes that contained most VNTRs, of which PTPRN2 and DLGAP2 are the most prominent examples, were found to be predominantly expressed in the brain and associated with a wide variety of neurological disorders. Repeat-induced variation represents a sizeable fraction of the genetic variation in human genomes and should be included in investigations of genetic factors associated with phenotypic traits, specifically those associated with neurological disorders. We make available the long and short-read sequence data of the supercentenarian genome, and a compendium of SVs as identified across 5 human haplotypes.
format Online
Article
Text
id pubmed-7608644
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7608644 2020-11-04 Transl Psychiatry Article Nature Publishing Group UK 2020-11-02 /pmc/articles/PMC7608644/ /pubmed/33139705 http://dx.doi.org/10.1038/s41398-020-01060-5 Text en
© The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain
topic Article