SASpector: analysis of missing genomic regions in draft genomes of prokaryotes

Bibliographic Details
Main authors: Lood, Cédric; Correa Rojo, Alejandro; Sinar, Deniz; Verkinderen, Emma; Lavigne, Rob; van Noort, Vera
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113259/
https://www.ncbi.nlm.nih.gov/pubmed/35561201
http://dx.doi.org/10.1093/bioinformatics/btac208
Description
Summary: Missing regions in short-read assemblies of prokaryote genomes are often attributed to biases in sequencing technologies and to repetitive elements, the former resulting in low sequencing coverage of certain loci and the latter in unresolved loops in the de novo assembly graph. We developed SASpector, a command-line tool that compares short-read assemblies (draft genomes) to their corresponding closed assemblies and extracts missing regions to analyze them at the sequence and functional level. SASpector makes it possible to benchmark the need for resolved genomes, can be integrated into pipelines to control the quality of assemblies, and could be used for comparative investigations of missingness in assemblies for which both short-read and long-read data are available in the public databases. AVAILABILITY AND IMPLEMENTATION: SASpector is available at https://github.com/LoGT-KULeuven/SASpector. The tool is implemented in Python3 and available through pip and Docker (0mician/saspector). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
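
The idea of extracting missing regions can be illustrated with a minimal sketch. It is not SASpector's implementation: it assumes that the draft contigs have already been aligned to the closed reference by some external aligner, and that the aligned stretches are given as half-open, 0-based intervals on the reference. The function name missing_regions and the interval representation are hypothetical, chosen only for this example. The missing regions are then the complement of the union of the aligned intervals.

```python
from typing import List, Tuple

# Assumption for this sketch: half-open [start, end), 0-based reference coordinates.
Interval = Tuple[int, int]


def missing_regions(ref_length: int, aligned: List[Interval]) -> List[Interval]:
    """Return the parts of [0, ref_length) not covered by any aligned interval.

    Hypothetical helper for illustration; not part of SASpector.
    """
    gaps: List[Interval] = []
    cursor = 0  # first reference position not yet known to be covered
    for start, end in sorted(aligned):
        # Clip each interval to the reference bounds.
        start = min(max(start, 0), ref_length)
        end = min(max(end, 0), ref_length)
        if end <= start:
            continue  # empty after clipping, covers nothing
        if start > cursor:
            # Uncovered stretch between the last covered position and this interval.
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < ref_length:
        # Uncovered tail of the reference.
        gaps.append((cursor, ref_length))
    return gaps


if __name__ == "__main__":
    # A 100 bp toy reference with three aligned stretches, two of them overlapping.
    print(missing_regions(100, [(10, 30), (20, 50), (80, 100)]))
    # → [(0, 10), (50, 80)]
```

Sorting the intervals first lets a single pass with one cursor merge overlaps implicitly, so no separate union step is needed. The sequences of the resulting intervals could then be sliced out of the reference for downstream sequence-level or functional analysis.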