
lobSTR: A short tandem repeat profiler for personal genomes

Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. Thes...


Bibliographic Details
Main Authors: Gymrek, Melissa; Golan, David; Rosset, Saharon; Erlich, Yaniv
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371701/
https://www.ncbi.nlm.nih.gov/pubmed/22522390
http://dx.doi.org/10.1101/gr.135780.111
author Gymrek, Melissa
Golan, David
Rosset, Saharon
Erlich, Yaniv
author_sort Gymrek, Melissa
collection PubMed
description Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR's accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR's implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format.
format Online
Article
Text
id pubmed-3371701
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3371701 2012-12-01 lobSTR: A short tandem repeat profiler for personal genomes Gymrek, Melissa Golan, David Rosset, Saharon Erlich, Yaniv Genome Res Method Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR's accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR's implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format.
Cold Spring Harbor Laboratory Press 2012-06 /pmc/articles/PMC3371701/ /pubmed/22522390 http://dx.doi.org/10.1101/gr.135780.111 Text en © 2012, Published by Cold Spring Harbor Laboratory Press This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
spellingShingle Method
Gymrek, Melissa
Golan, David
Rosset, Saharon
Erlich, Yaniv
lobSTR: A short tandem repeat profiler for personal genomes
title lobSTR: A short tandem repeat profiler for personal genomes
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371701/
https://www.ncbi.nlm.nih.gov/pubmed/22522390
http://dx.doi.org/10.1101/gr.135780.111