
SASpector: analysis of missing genomic regions in draft genomes of prokaryotes

SUMMARY: Missing regions in short-read assemblies of prokaryote genomes are often attributed to biases in sequencing technologies and to repetitive elements, the former resulting in low sequencing coverage of certain loci and the latter in unresolved loops in the de novo assembly graph. We developed...


Bibliographic Details
Main authors: Lood, Cédric, Correa Rojo, Alejandro, Sinar, Deniz, Verkinderen, Emma, Lavigne, Rob, van Noort, Vera
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Applications Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113259/
https://www.ncbi.nlm.nih.gov/pubmed/35561201
http://dx.doi.org/10.1093/bioinformatics/btac208
_version_ 1784709552342564864
author Lood, Cédric
Correa Rojo, Alejandro
Sinar, Deniz
Verkinderen, Emma
Lavigne, Rob
van Noort, Vera
author_facet Lood, Cédric
Correa Rojo, Alejandro
Sinar, Deniz
Verkinderen, Emma
Lavigne, Rob
van Noort, Vera
author_sort Lood, Cédric
collection PubMed
description SUMMARY: Missing regions in short-read assemblies of prokaryote genomes are often attributed to biases in sequencing technologies and to repetitive elements, the former resulting in low sequencing coverage of certain loci and the latter in unresolved loops in the de novo assembly graph. We developed SASpector, a command-line tool that compares short-read assemblies (draft genomes) to their corresponding closed assemblies and extracts missing regions to analyze them at the sequence and functional level. SASpector makes it possible to benchmark the need for resolved genomes, can be integrated into pipelines to control the quality of assemblies, and could be used for comparative investigations of missingness in assemblies for which both short-read and long-read data are available in the public databases. AVAILABILITY AND IMPLEMENTATION: SASpector is available at https://github.com/LoGT-KULeuven/SASpector. The tool is implemented in Python3 and available through pip and Docker (0mician/saspector). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
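The idea of a "missing region" in the description above can be illustrated with a small sketch. This is not SASpector's code or its actual method: it only assumes that draft contigs have already been aligned to the closed reference by some external aligner, and that each alignment is given as a half-open interval on the reference. The function names `missing_regions` and `gc_content` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# Half-open interval [start, end) in reference coordinates (an assumption
# of this sketch, not a format defined by SASpector).
Interval = Tuple[int, int]


def missing_regions(ref_length: int, aligned: List[Interval]) -> List[Interval]:
    """Return the reference intervals not covered by any aligned draft contig."""
    gaps: List[Interval] = []
    cursor = 0  # first reference position not yet known to be covered
    for start, end in sorted(aligned):
        # Clamp each alignment to the reference boundaries.
        start = min(max(start, 0), ref_length)
        end = min(max(end, 0), ref_length)
        if end <= start:
            continue  # empty or out-of-range alignment
        if start > cursor:
            gaps.append((cursor, start))  # uncovered stretch before this hit
        cursor = max(cursor, end)
    if cursor < ref_length:
        gaps.append((cursor, ref_length))  # uncovered tail of the reference
    return gaps


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Toy reference of 100 bp with three hypothetical contig alignments.
    for start, end in missing_regions(100, [(10, 30), (20, 50), (70, 90)]):
        print(f"missing: {start}-{end} ({end - start} bp)")
```

Each reported interval could then be sliced out of the reference sequence and summarized, for example with `gc_content`, which is one simple instance of analysis at the sequence level.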
format Online
Article
Text
id pubmed-9113259
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-91132592022-05-18 SASpector: analysis of missing genomic regions in draft genomes of prokaryotes Lood, Cédric Correa Rojo, Alejandro Sinar, Deniz Verkinderen, Emma Lavigne, Rob van Noort, Vera Bioinformatics Applications Notes SUMMARY: Missing regions in short-read assemblies of prokaryote genomes are often attributed to biases in sequencing technologies and to repetitive elements, the former resulting in low sequencing coverage of certain loci and the latter in unresolved loops in the de novo assembly graph. We developed SASpector, a command-line tool that compares short-read assemblies (draft genomes) to their corresponding closed assemblies and extracts missing regions to analyze them at the sequence and functional level. SASpector makes it possible to benchmark the need for resolved genomes, can be integrated into pipelines to control the quality of assemblies, and could be used for comparative investigations of missingness in assemblies for which both short-read and long-read data are available in the public databases. AVAILABILITY AND IMPLEMENTATION: SASpector is available at https://github.com/LoGT-KULeuven/SASpector. The tool is implemented in Python3 and available through pip and Docker (0mician/saspector). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-04-06 /pmc/articles/PMC9113259/ /pubmed/35561201 http://dx.doi.org/10.1093/bioinformatics/btac208 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Applications Notes
Lood, Cédric
Correa Rojo, Alejandro
Sinar, Deniz
Verkinderen, Emma
Lavigne, Rob
van Noort, Vera
SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title_full SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title_fullStr SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title_full_unstemmed SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title_short SASpector: analysis of missing genomic regions in draft genomes of prokaryotes
title_sort saspector: analysis of missing genomic regions in draft genomes of prokaryotes
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113259/
https://www.ncbi.nlm.nih.gov/pubmed/35561201
http://dx.doi.org/10.1093/bioinformatics/btac208
work_keys_str_mv AT loodcedric saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes
AT correarojoalejandro saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes
AT sinardeniz saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes
AT verkinderenemma saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes
AT lavignerob saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes
AT vannoortvera saspectoranalysisofmissinggenomicregionsindraftgenomesofprokaryotes