VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data
DNA tandem repeats (TRs) are ubiquitous genomic features which consist of two or more adjacent copies of an underlying pattern sequence. The copies may be identical or approximate. Variable number of tandem repeats or VNTRs are polymorphic TR loci in which the number of pattern copies is variable. I...
Main authors: | Gelfand, Yevgeniy; Hernandez, Yozen; Loving, Joshua; Benson, Gary |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2014 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4132751/ https://www.ncbi.nlm.nih.gov/pubmed/25056320 http://dx.doi.org/10.1093/nar/gku642 |
_version_ | 1782330673559764992 |
---|---|
author | Gelfand, Yevgeniy Hernandez, Yozen Loving, Joshua Benson, Gary |
author_facet | Gelfand, Yevgeniy Hernandez, Yozen Loving, Joshua Benson, Gary |
author_sort | Gelfand, Yevgeniy |
collection | PubMed |
description | DNA tandem repeats (TRs) are ubiquitous genomic features which consist of two or more adjacent copies of an underlying pattern sequence. The copies may be identical or approximate. Variable number of tandem repeats or VNTRs are polymorphic TR loci in which the number of pattern copies is variable. In this paper we describe VNTRseek, our software for discovery of minisatellite VNTRs (pattern size ≥ 7 nucleotides) using whole genome sequencing data. VNTRseek maps sequencing reads to a set of reference TRs and then identifies putative VNTRs based on a discrepancy between the copy number of a reference and its mapped reads. VNTRseek was used to analyze the Watson and Khoisan genomes (454 technology) and two 1000 Genomes family trios (Illumina). In the Watson genome, we identified 752 VNTRs with pattern sizes ranging from 7 to 84 nt. In the Khoisan genome, we identified 2572 VNTRs with pattern sizes ranging from 7 to 105 nt. In the trios, we identified between 2660 and 3822 VNTRs per individual and found nearly 100% consistency with Mendelian inheritance. VNTRseek is, to the best of our knowledge, the first software for genome-wide detection of minisatellite VNTRs. It is available at http://orca.bu.edu/vntrseek/. |
format | Online Article Text |
id | pubmed-4132751 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-41327512014-12-01 VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data Gelfand, Yevgeniy Hernandez, Yozen Loving, Joshua Benson, Gary Nucleic Acids Res Computational Biology DNA tandem repeats (TRs) are ubiquitous genomic features which consist of two or more adjacent copies of an underlying pattern sequence. The copies may be identical or approximate. Variable number of tandem repeats or VNTRs are polymorphic TR loci in which the number of pattern copies is variable. In this paper we describe VNTRseek, our software for discovery of minisatellite VNTRs (pattern size ≥ 7 nucleotides) using whole genome sequencing data. VNTRseek maps sequencing reads to a set of reference TRs and then identifies putative VNTRs based on a discrepancy between the copy number of a reference and its mapped reads. VNTRseek was used to analyze the Watson and Khoisan genomes (454 technology) and two 1000 Genomes family trios (Illumina). In the Watson genome, we identified 752 VNTRs with pattern sizes ranging from 7 to 84 nt. In the Khoisan genome, we identified 2572 VNTRs with pattern sizes ranging from 7 to 105 nt. In the trios, we identified between 2660 and 3822 VNTRs per individual and found nearly 100% consistency with Mendelian inheritance. VNTRseek is, to the best of our knowledge, the first software for genome-wide detection of minisatellite VNTRs. It is available at http://orca.bu.edu/vntrseek/. Oxford University Press 2014-08-18 2014-07-23 /pmc/articles/PMC4132751/ /pubmed/25056320 http://dx.doi.org/10.1093/nar/gku642 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Gelfand, Yevgeniy Hernandez, Yozen Loving, Joshua Benson, Gary VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title | VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title_full | VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title_fullStr | VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title_full_unstemmed | VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title_short | VNTRseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
title_sort | vntrseek—a computational tool to detect tandem repeat variants in high-throughput sequencing data |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4132751/ https://www.ncbi.nlm.nih.gov/pubmed/25056320 http://dx.doi.org/10.1093/nar/gku642 |
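
The description field above says that VNTRseek flags a putative VNTR when the copy number of a reference TR differs from that of its mapped reads. The sketch below illustrates only that comparison. It is not VNTRseek's implementation: the function names, the use of repeat-array lengths as input, and the `min_support` threshold are all assumptions made for illustration. The only value taken from the record is the minisatellite pattern size of at least 7 nucleotides.

```python
from collections import Counter

MIN_PATTERN_SIZE = 7  # minisatellite threshold stated in the abstract


def copy_number(array_length: int, pattern_length: int) -> float:
    """Number of pattern copies spanned by a repeat array (may be fractional)."""
    if pattern_length < MIN_PATTERN_SIZE:
        raise ValueError("pattern shorter than the minisatellite threshold")
    return array_length / pattern_length


def copy_difference(ref_len: int, read_len: int, pattern_length: int) -> int:
    """Whole-copy difference between a read's repeat array and the reference."""
    return round(copy_number(read_len, pattern_length)
                 - copy_number(ref_len, pattern_length))


def call_vntr(ref_len, read_lens, pattern_length, min_support=2):
    """Return the non-zero copy differences seen in at least min_support reads.

    min_support is a hypothetical filter, not a documented VNTRseek parameter.
    """
    counts = Counter(copy_difference(ref_len, r, pattern_length)
                     for r in read_lens)
    return sorted(d for d, n in counts.items() if d != 0 and n >= min_support)


# Reference array of 35 nt with a 10 nt pattern (3.5 copies); five mapped reads.
alleles = call_vntr(35, [35, 36, 45, 44, 25], 10)
print(alleles)
```

In this toy input, two reads match the reference, two carry one extra copy, and a single read suggests one copy fewer; with the assumed support threshold of 2, only the gain of one copy is reported.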