SIS: a program to generate draft genome sequence scaffolds for prokaryotes

BACKGROUND: Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig sc...

Bibliographic Details
Main authors: Dias, Zanoni; Dias, Ulisses; Setubal, João C
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3674793/
https://www.ncbi.nlm.nih.gov/pubmed/22583530
http://dx.doi.org/10.1186/1471-2105-13-96
author Dias, Zanoni
Dias, Ulisses
Setubal, João C
collection PubMed
description BACKGROUND: Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes may result in incorrect scaffolds, if these rearrangements are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes. Even in draft genomes it is possible to detect the presence of inversions given sufficient sequencing coverage and a sufficiently close reference genome.
RESULTS: We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, even though in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of sis, the program that implements the algorithm, to seven other scaffold-generating programs. The results of our tests show that sis has overall better performance.
CONCLUSIONS: sis is a new easy-to-use tool to generate contig scaffolds, available both as stand-alone and as a web server. The good performance of sis in our tests adds evidence that large-scale inversions are widespread in prokaryotic genomes.
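The description mentions the popular technique of mapping contigs onto a reference genome to obtain a scaffold. The sketch below illustrates only that naive baseline, not the sis algorithm, whose details are not given in this record. The `Hit` type, its field names, and the example coordinates are hypothetical and chosen purely for illustration; it assumes each contig has already been aligned to the reference by some external tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One contig-to-reference alignment (hypothetical structure)."""
    contig: str
    ref_start: int  # start coordinate of the match on the reference
    ref_end: int    # end coordinate of the match on the reference
    strand: str     # "+" if the contig matches as-is, "-" if reverse-complemented


def naive_scaffold(hits):
    """Order contigs by where they map on the reference.

    The orientation of each contig is taken directly from the strand
    of its alignment. No rearrangement between query and reference is
    considered, which is exactly the weakness the description points out.
    """
    ordered = sorted(hits, key=lambda h: h.ref_start)
    return [(h.contig, h.strand) for h in ordered]


# Made-up alignments, listed in arbitrary order.
hits = [
    Hit("c3", 9000, 12000, "+"),
    Hit("c1", 100, 4000, "+"),
    Hit("c2", 4500, 8000, "-"),
]

scaffold = naive_scaffold(hits)
print(scaffold)
```

If the query genome carries a large-scale inversion relative to the reference, the order and orientation produced this way follow the reference rather than the query, which is the situation the description says must be taken into account.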
format Online
Article
Text
id pubmed-3674793
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3674793 2013-06-10 SIS: a program to generate draft genome sequence scaffolds for prokaryotes Dias, Zanoni Dias, Ulisses Setubal, João C BMC Bioinformatics Research Article BioMed Central 2012-05-14 /pmc/articles/PMC3674793/ /pubmed/22583530 http://dx.doi.org/10.1186/1471-2105-13-96 Text en Copyright © 2012 Dias et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SIS: a program to generate draft genome sequence scaffolds for prokaryotes
topic Research Article