Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain
The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de...
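Main authors: | Linthorst, Jasper; Meert, Wim; Hestand, Matthew S.; Korlach, Jonas; Vermeesch, Joris Robert; Reinders, Marcel J. T.; Holstege, Henne |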
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7608644/ https://www.ncbi.nlm.nih.gov/pubmed/33139705 http://dx.doi.org/10.1038/s41398-020-01060-5 |
_version_ | 1783604877385007104 |
---|---|
author | Linthorst, Jasper Meert, Wim Hestand, Matthew S. Korlach, Jonas Vermeesch, Joris Robert Reinders, Marcel J. T. Holstege, Henne |
author_facet | Linthorst, Jasper Meert, Wim Hestand, Matthew S. Korlach, Jonas Vermeesch, Joris Robert Reinders, Marcel J. T. Holstege, Henne |
author_sort | Linthorst, Jasper |
collection | PubMed |
description | The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year-old Dutch cognitively healthy woman. We combined this assembly with two previously published haploid assemblies (CHM1 and CHM13) and the GRCh38 reference genome to create a compendium of SVs that occur across five independent human haplotypes using the graph-based multi-genome aligner REVEAL. Across these five haplotypes, we detected 31,680 euchromatic SVs (>50 bp). Of these, ~62% were comprised of repetitive sequences with ‘variable number tandem repeats’ (VNTRs), ~10% were mobile elements (Alu, L1, and SVA), while the remaining variants were inversions and indels. We observed that VNTRs with GC-content >60% and repeat patterns longer than 15 bp were 21-fold enriched in the subtelomeric regions (within 5 Mb of the ends of chromosome arms). VNTR lengths can expand to exceed a critical length which is associated with impaired gene transcription. The genes that contained most VNTRs, of which PTPRN2 and DLGAP2 are the most prominent examples, were found to be predominantly expressed in the brain and associated with a wide variety of neurological disorders. Repeat-induced variation represents a sizeable fraction of the genetic variation in human genomes and should be included in investigations of genetic factors associated with phenotypic traits, specifically those associated with neurological disorders. We make available the long and short-read sequence data of the supercentenarian genome, and a compendium of SVs as identified across 5 human haplotypes. |
format | Online Article Text |
id | pubmed-7608644 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-76086442020-11-04 Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain Linthorst, Jasper Meert, Wim Hestand, Matthew S. Korlach, Jonas Vermeesch, Joris Robert Reinders, Marcel J. T. Holstege, Henne Transl Psychiatry Article The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year-old Dutch cognitively healthy woman. We combined this assembly with two previously published haploid assemblies (CHM1 and CHM13) and the GRCh38 reference genome to create a compendium of SVs that occur across five independent human haplotypes using the graph-based multi-genome aligner REVEAL. Across these five haplotypes, we detected 31,680 euchromatic SVs (>50 bp). Of these, ~62% were comprised of repetitive sequences with ‘variable number tandem repeats’ (VNTRs), ~10% were mobile elements (Alu, L1, and SVA), while the remaining variants were inversions and indels. We observed that VNTRs with GC-content >60% and repeat patterns longer than 15 bp were 21-fold enriched in the subtelomeric regions (within 5 Mb of the ends of chromosome arms). VNTR lengths can expand to exceed a critical length which is associated with impaired gene transcription. The genes that contained most VNTRs, of which PTPRN2 and DLGAP2 are the most prominent examples, were found to be predominantly expressed in the brain and associated with a wide variety of neurological disorders. 
Repeat-induced variation represents a sizeable fraction of the genetic variation in human genomes and should be included in investigations of genetic factors associated with phenotypic traits, specifically those associated with neurological disorders. We make available the long and short-read sequence data of the supercentenarian genome, and a compendium of SVs as identified across 5 human haplotypes. Nature Publishing Group UK 2020-11-02 /pmc/articles/PMC7608644/ /pubmed/33139705 http://dx.doi.org/10.1038/s41398-020-01060-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Linthorst, Jasper Meert, Wim Hestand, Matthew S. Korlach, Jonas Vermeesch, Joris Robert Reinders, Marcel J. T. Holstege, Henne Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title | Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title_full | Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title_fullStr | Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title_full_unstemmed | Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title_short | Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain |
title_sort | extreme enrichment of vntr-associated polymorphicity in human subtelomeres: genes with most vntrs are predominantly expressed in the brain |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7608644/ https://www.ncbi.nlm.nih.gov/pubmed/33139705 http://dx.doi.org/10.1038/s41398-020-01060-5 |
work_keys_str_mv | AT linthorstjasper extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT meertwim extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT hestandmatthews extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT korlachjonas extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT vermeeschjorisrobert extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT reindersmarceljt extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain AT holstegehenne extremeenrichmentofvntrassociatedpolymorphicityinhumansubtelomeresgeneswithmostvntrsarepredominantlyexpressedinthebrain |