Targeted genotyping of variable number tandem repeats with adVNTR
Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Numbe...
Main Authors: | Bakhtiari, Mehrdad; Shleizer-Burko, Sharona; Gymrek, Melissa; Bansal, Vikas; Bafna, Vineet |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2018 |
Subjects: | Method |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211647/ https://www.ncbi.nlm.nih.gov/pubmed/30352806 http://dx.doi.org/10.1101/gr.235119.118 |
_version_ | 1783367377093656576 |
---|---|
author | Bakhtiari, Mehrdad Shleizer-Burko, Sharona Gymrek, Melissa Bansal, Vikas Bafna, Vineet |
author_sort | Bakhtiari, Mehrdad |
collection | PubMed |
description | Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6–100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets. |
format | Online Article Text |
id | pubmed-6211647 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-62116472019-05-01 Targeted genotyping of variable number tandem repeats with adVNTR Bakhtiari, Mehrdad Shleizer-Burko, Sharona Gymrek, Melissa Bansal, Vikas Bafna, Vineet Genome Res Method Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6–100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets. Cold Spring Harbor Laboratory Press 2018-11 /pmc/articles/PMC6211647/ /pubmed/30352806 http://dx.doi.org/10.1101/gr.235119.118 Text en © 2018 Bakhtiari et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/. |
title | Targeted genotyping of variable number tandem repeats with adVNTR |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211647/ https://www.ncbi.nlm.nih.gov/pubmed/30352806 http://dx.doi.org/10.1101/gr.235119.118 |