Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs

Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read...

Bibliographic Details
Main Authors: Lu, Tsung-Yu, Chaisson, Mark J. P.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275641/
https://www.ncbi.nlm.nih.gov/pubmed/34253730
http://dx.doi.org/10.1038/s41467-021-24378-0
author Lu, Tsung-Yu
Chaisson, Mark J. P.
collection PubMed
description Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build a RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.
format Online
Article
Text
id pubmed-8275641
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8275641 2021-07-20 Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs. Lu, Tsung-Yu; Chaisson, Mark J. P. Nat Commun. Article. Nature Publishing Group UK 2021-07-12 /pmc/articles/PMC8275641/ /pubmed/34253730 http://dx.doi.org/10.1038/s41467-021-24378-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs
topic Article