Targeted genotyping of variable number tandem repeats with adVNTR

Bibliographic Details
Main authors: Bakhtiari, Mehrdad; Shleizer-Burko, Sharona; Gymrek, Melissa; Bansal, Vikas; Bafna, Vineet
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2018
Subjects: Method
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211647/
https://www.ncbi.nlm.nih.gov/pubmed/30352806
http://dx.doi.org/10.1101/gr.235119.118
Collection: PubMed
Description: Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6–100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets.
Record ID: pubmed-6211647
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: Genome Res (Method)
Publication date: 2018-11
Record date: 2019-05-01
License: © 2018 Bakhtiari et al.; Published by Cold Spring Harbor Laboratory Press. http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.