
Targeted Assembly of Short Sequence Reads

As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS...


Bibliographic Details
Main Authors: Warren, René L., Holt, Robert A.
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3092772/
https://www.ncbi.nlm.nih.gov/pubmed/21589938
http://dx.doi.org/10.1371/journal.pone.0019816
_version_ 1782203404612796416
author Warren, René L.
Holt, Robert A.
author_facet Warren, René L.
Holt, Robert A.
author_sort Warren, René L.
collection PubMed
description As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
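The read-recruitment step in the description above (keep only reads that share an exact short word with a target sequence) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not TASR's actual implementation: the function names (`target_words`, `recruit_reads`), the default word length `k=15`, and the decision to also test each read's reverse complement are hypothetical choices made for this example. The stringent assembly of the recruited reads is not shown.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def target_words(target, k):
    """Collect every overlapping word of length k found in the target."""
    target = target.upper()
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def recruit_reads(reads, target, k=15):
    """Keep reads that share at least one exact k-length word with the target.

    Each read is tested in both orientations (an assumption of this sketch);
    a recruited read is returned in the orientation that matched the target.
    """
    words = target_words(target, k)
    recruited = []
    for read in reads:
        read = read.upper()
        for candidate in (read, reverse_complement(read)):
            n_words = len(candidate) - k + 1
            if any(candidate[i:i + k] in words for i in range(n_words)):
                recruited.append(candidate)
                break  # do not add the same read twice
    return recruited


# Hypothetical toy data: one target and three short reads.
target = "ACGTTAGCCGATAGCTAGGCTTACG"
reads = [
    "CCGATAGCTAGGCTTACGGGTTT",   # overlaps the target and extends past its end
    "GCTATCGGCTAA",              # matches the target only on the reverse strand
    "TTTTTTTTTTTTTTTT",          # unrelated read, should be ignored
]
hits = recruit_reads(reads, target, k=8)
print(len(hits), "of", len(reads), "reads recruited")
```

Storing the target words in a set keeps each lookup constant-time on average, so the cost of the scan grows with the total length of the reads rather than with the size of a whole-genome reference, which is the point of restricting analysis to the target's sequence space.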
format Text
id pubmed-3092772
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3092772 2011-05-17 Targeted Assembly of Short Sequence Reads Warren, René L. Holt, Robert A. PLoS One Research Article Public Library of Science 2011-05-11 /pmc/articles/PMC3092772/ /pubmed/21589938 http://dx.doi.org/10.1371/journal.pone.0019816 Text en Warren, Holt.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Warren, René L.
Holt, Robert A.
Targeted Assembly of Short Sequence Reads
title Targeted Assembly of Short Sequence Reads
title_full Targeted Assembly of Short Sequence Reads
title_fullStr Targeted Assembly of Short Sequence Reads
title_full_unstemmed Targeted Assembly of Short Sequence Reads
title_short Targeted Assembly of Short Sequence Reads
title_sort targeted assembly of short sequence reads
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3092772/
https://www.ncbi.nlm.nih.gov/pubmed/21589938
http://dx.doi.org/10.1371/journal.pone.0019816
work_keys_str_mv AT warrenrenel targetedassemblyofshortsequencereads
AT holtroberta targetedassemblyofshortsequencereads