
Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.

Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challe...


Bibliographic Details
Main authors: de Almeida, Felipe Marques, de Campos, Tatiana Amabile, Pappas Jr, Georgios Joannis
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10646344/
https://www.ncbi.nlm.nih.gov/pubmed/37970066
http://dx.doi.org/10.12688/f1000research.139488.1
_version_ 1785134877001121792
author de Almeida, Felipe Marques
de Campos, Tatiana Amabile
Pappas Jr, Georgios Joannis
author_facet de Almeida, Felipe Marques
de Campos, Tatiana Amabile
Pappas Jr, Georgios Joannis
author_sort de Almeida, Felipe Marques
collection PubMed
description Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challenge due to the varying quality and quantity of data obtained from different sequencing instruments, as well as intrinsic characteristics of the genome and desired analyses. To address this issue, we have developed a set of container-based pipelines using Nextflow, offering both common workflows for inexperienced users and high levels of customization for experienced ones. Their processing strategies are adaptable based on the sequencing data type, and their modularity enables the incorporation of new components to address the community’s evolving needs. Methods: These pipelines consist of three parts: quality control, de novo genome assembly, and bacterial genome annotation. In particular, the genome annotation pipeline provides a comprehensive overview of the genome, including standard gene prediction and functional inference, as well as predictions relevant to clinical applications such as virulence and resistance gene annotation, secondary metabolite detection, prophage and plasmid prediction, and more. Results: The annotation results are presented in reports, genome browsers, and a web-based application that enables users to explore and interact with the genome annotation results. Conclusions: Overall, our user-friendly pipelines offer a seamless integration of computational tools to facilitate routine bacterial genomics research. The effectiveness of these is illustrated by examining the sequencing data of a clinical sample of Klebsiella pneumoniae.
format Online
Article
Text
id pubmed-10646344
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-106463442023-09-25 Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. de Almeida, Felipe Marques de Campos, Tatiana Amabile Pappas Jr, Georgios Joannis F1000Res Software Tool Article Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost effective chromosome level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challenge due to the varying quality and quantity of data obtained from different sequencing instruments, as well as intrinsic characteristics of the genome and desired analyses. To address this issue, we have developed a set of container-based pipelines using Nextflow, offering both common workflows for inexperienced users and high levels of customization for experienced ones. Their processing strategies are adaptable based on the sequencing data type, and their modularity enables the incorporation of new components to address the community’s evolving needs. Methods: These pipelines consist of three parts: quality control, de novo genome assembly, and bacterial genome annotation. In particular, the genome annotation pipeline provides a comprehensive overview of the genome, including standard gene prediction and functional inference, as well as predictions relevant to clinical applications such as virulence and resistance gene annotation, secondary metabolite detection, prophage and plasmid prediction, and more. Results: The annotation results are presented in reports, genome browsers, and a web-based application that enables users to explore and interact with the genome annotation results. Conclusions: Overall, our user-friendly pipelines offer a seamless integration of computational tools to facilitate routine bacterial genomics research. 
The effectiveness of these is illustrated by examining the sequencing data of a clinical sample of Klebsiella pneumoniae. F1000 Research Limited 2023-09-25 /pmc/articles/PMC10646344/ /pubmed/37970066 http://dx.doi.org/10.12688/f1000research.139488.1 Text en Copyright: © 2023 Almeida FMd et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Tool Article
de Almeida, Felipe Marques
de Campos, Tatiana Amabile
Pappas Jr, Georgios Joannis
Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title_full Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title_fullStr Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title_full_unstemmed Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title_short Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
title_sort scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
topic Software Tool Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10646344/
https://www.ncbi.nlm.nih.gov/pubmed/37970066
http://dx.doi.org/10.12688/f1000research.139488.1
work_keys_str_mv AT dealmeidafelipemarques scalableandversatilecontainerbasedpipelinesfordenovogenomeassemblyandbacterialannotation
AT decampostatianaamabile scalableandversatilecontainerbasedpipelinesfordenovogenomeassemblyandbacterialannotation
AT pappasjrgeorgiosjoannis scalableandversatilecontainerbasedpipelinesfordenovogenomeassemblyandbacterialannotation