Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs
Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read...
Main Authors: | Lu, Tsung-Yu; Chaisson, Mark J. P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275641/ https://www.ncbi.nlm.nih.gov/pubmed/34253730 http://dx.doi.org/10.1038/s41467-021-24378-0 |
_version_ | 1783721760067158016 |
---|---|
author | Lu, Tsung-Yu Chaisson, Mark J. P. |
author_facet | Lu, Tsung-Yu Chaisson, Mark J. P. |
author_sort | Lu, Tsung-Yu |
collection | PubMed |
description | Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build a RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease. |
format | Online Article Text |
id | pubmed-8275641 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8275641 2021-07-20 Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs Lu, Tsung-Yu Chaisson, Mark J. P. Nat Commun Article Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build a RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease. Nature Publishing Group UK 2021-07-12 /pmc/articles/PMC8275641/ /pubmed/34253730 http://dx.doi.org/10.1038/s41467-021-24378-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Lu, Tsung-Yu Chaisson, Mark J. P. Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title | Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title_full | Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title_fullStr | Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title_full_unstemmed | Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title_short | Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
title_sort | profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275641/ https://www.ncbi.nlm.nih.gov/pubmed/34253730 http://dx.doi.org/10.1038/s41467-021-24378-0 |
work_keys_str_mv | AT lutsungyu profilingvariablenumbertandemrepeatvariationacrosspopulationsusingrepeatpangenomegraphs AT profilingvariablenumbertandemrepeatvariationacrosspopulationsusingrepeatpangenomegraphs AT chaissonmarkjp profilingvariablenumbertandemrepeatvariationacrosspopulationsusingrepeatpangenomegraphs |
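
The description field above characterizes a repeat-pangenome graph as a structure that encodes both population diversity and repeat structure for a VNTR locus, and that short reads can be counted against. The following is a minimal illustrative sketch of that general idea, not the authors' software: it assumes a plain de Bruijn-style k-mer graph, ignores reverse complements and flanking sequence, and every function name (`kmers`, `build_locus_graph`, `count_read_kmers`) is hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_locus_graph(haplotypes, k):
    """Union the k-mers (nodes) and consecutive k-mer pairs (edges)
    of all haplotype sequences for one locus.

    Repeated units collapse onto the same nodes, so tandem repeats
    appear as cycles; variants between haplotypes appear as branches.
    This is an assumed simplification, not the published construction.
    """
    nodes, edges = set(), set()
    for seq in haplotypes:
        prev = None
        for km in kmers(seq, k):
            nodes.add(km)
            if prev is not None:
                edges.add((prev, km))
            prev = km
    return nodes, edges


def count_read_kmers(reads, nodes, k):
    """Count how often each graph k-mer occurs in the reads.

    k-mers absent from the graph are ignored. The total count is a
    crude proxy for repeat content at the locus.
    """
    counts = Counter()
    for read in reads:
        for km in kmers(read, k):
            if km in nodes:
                counts[km] += 1
    return counts
```

With two toy haplotypes such as `ACGACGACGT` and `ACGACTACGT` and `k = 3`, the shared `ACG` repeat collapses into one cycle and the `ACT` variant adds a branch. Summing the counts returned by `count_read_kmers` then gives a rough, unnormalized signal of how much read sequence falls on the locus.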