slag: A program for seeded local assembly of genes in complex genomes

Although finished genomes have become more common, there is still a need for assemblies of individual genes or chromosomal regions when only unassembled reads are available. slag (Seeded Local Assembly of Genes) fulfils this need by performing iterative local assembly based on cycles of matching‐rea...

Bibliographic Details
Main authors: Crane, Charles F., Nemacheck, Jill A., Subramanyam, Subhashree, Williams, Christie E., Goodwin, Stephen B.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9303413/
https://www.ncbi.nlm.nih.gov/pubmed/34995394
http://dx.doi.org/10.1111/1755-0998.13580
author Crane, Charles F.
Nemacheck, Jill A.
Subramanyam, Subhashree
Williams, Christie E.
Goodwin, Stephen B.
collection PubMed
description Although finished genomes have become more common, there is still a need for assemblies of individual genes or chromosomal regions when only unassembled reads are available. slag (Seeded Local Assembly of Genes) fulfils this need by performing iterative local assembly based on cycles of matching‐read retrieval with blast and assembly with cap3, phrap, spades, canu or unicycler. The target sequence can be nucleotide or protein. Read fragmentation allows slag to use phrap or cap3 to assemble long reads at lower coverage (e.g., 5×) than is possible with canu or unicycler. In simple, nonrepetitive genomes, a slag assembly can cover a whole chromosome, but in complex genomes the growth of target‐matching contigs is limited as additional reads are consumed by consensus contigs consisting of repetitive elements. Apart from genomic complexity, contig length and correctness depend on read length and accuracy. With pyrosequencing or Illumina reads, slag‐assembled contigs are accurate enough to allow design of PCR primers, while contigs assembled from Oxford Nanopore or pre‐HiFi Pacific Biosciences long reads are generally only accurate enough to design baiting sequences for further targeted sequencing. In an application with real reads, slag successfully extended sequences for four wheat genes, which were verified by cloning and Sanger sequencing of overlapping amplicons. slag is a robust alternative to atram 2 for local assemblies, especially for read sets with less than 20× coverage. slag is freely available at https://github.com/cfcrane/SLAG.
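The retrieve-then-assemble cycle described in the abstract can be illustrated with a minimal sketch. It is not slag's implementation: the blast retrieval step is replaced by a toy shared k-mer filter, the assembler (cap3, phrap, spades, canu or unicycler) is replaced by a toy greedy end-overlap merge, and all function names and parameters (`kmers`, `retrieve`, `extend`, `local_assembly`, `k`, `min_overlap`, `max_cycles`) are hypothetical. The toy "genome" uses 52 distinct letters so that every k-mer is unique, mimicking a simple, nonrepetitive genome.

```python
import string


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def retrieve(reads, contig, k):
    """Toy stand-in for the blast step: keep reads sharing a k-mer with the contig."""
    contig_kmers = kmers(contig, k)
    return [r for r in reads if kmers(r, k) & contig_kmers]


def extend(contig, read, min_overlap):
    """Toy stand-in for the assembler: merge one read onto either end of the contig."""
    for n in range(min(len(contig), len(read)), min_overlap - 1, -1):
        if contig.endswith(read[:n]):   # read continues the right end
            return contig + read[n:]
        if read.endswith(contig[:n]):   # read continues the left end
            return read[:-n] + contig
    return contig                       # no usable overlap found


def local_assembly(seed, reads, k=4, min_overlap=4, max_cycles=20):
    """Repeat retrieval and extension until the contig stops growing."""
    contig = seed
    for _ in range(max_cycles):
        previous = contig
        for read in retrieve(reads, contig, k):
            contig = extend(contig, read, min_overlap)
        if contig == previous:          # no growth in this cycle: stop
            break
    return contig


# Hypothetical toy data: a repeat-free "genome" and overlapping 12-letter reads.
genome = string.ascii_lowercase + string.ascii_uppercase
reads = [genome[i:i + 12] for i in range(0, len(genome) - 11, 4)]
seed = genome[20:28]
assembled = local_assembly(seed, reads)
print(assembled == genome)
```

Because each cycle only retrieves reads that match the current contig, the contig grows outward from the seed a few reads at a time. In a repetitive genome the same loop would also pull in reads from other copies of a repeat, which is the limitation on contig growth that the abstract describes.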
format Online
Article
Text
id pubmed-9303413
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9303413 2022-07-22 slag: A program for seeded local assembly of genes in complex genomes Crane, Charles F. Nemacheck, Jill A. Subramanyam, Subhashree Williams, Christie E. Goodwin, Stephen B. Mol Ecol Resour RESOURCE ARTICLES John Wiley and Sons Inc. 2022-01-27 2022-07 /pmc/articles/PMC9303413/ /pubmed/34995394 http://dx.doi.org/10.1111/1755-0998.13580 Text en Published 2022.
This article is a U.S. Government work and is in the public domain in the USA. Molecular Ecology Resources published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title slag: A program for seeded local assembly of genes in complex genomes
topic RESOURCE ARTICLES