Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation.
Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challe...
Main authors: | de Almeida, Felipe Marques; de Campos, Tatiana Amabile; Pappas Jr, Georgios Joannis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited, 2023 |
Subjects: | Software Tool Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10646344/ https://www.ncbi.nlm.nih.gov/pubmed/37970066 http://dx.doi.org/10.12688/f1000research.139488.1 |
_version_ | 1785134877001121792 |
---|---|
author | de Almeida, Felipe Marques; de Campos, Tatiana Amabile; Pappas Jr, Georgios Joannis |
author_facet | de Almeida, Felipe Marques; de Campos, Tatiana Amabile; Pappas Jr, Georgios Joannis |
author_sort | de Almeida, Felipe Marques |
collection | PubMed |
description | Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challenge due to the varying quality and quantity of data obtained from different sequencing instruments, as well as intrinsic characteristics of the genome and desired analyses. To address this issue, we have developed a set of container-based pipelines using Nextflow, offering both common workflows for inexperienced users and high levels of customization for experienced ones. Their processing strategies are adaptable based on the sequencing data type, and their modularity enables the incorporation of new components to address the community’s evolving needs. Methods: These pipelines consist of three parts: quality control, de novo genome assembly, and bacterial genome annotation. In particular, the genome annotation pipeline provides a comprehensive overview of the genome, including standard gene prediction and functional inference, as well as predictions relevant to clinical applications such as virulence and resistance gene annotation, secondary metabolite detection, prophage and plasmid prediction, and more. Results: The annotation results are presented in reports, genome browsers, and a web-based application that enables users to explore and interact with the genome annotation results. Conclusions: Overall, our user-friendly pipelines offer a seamless integration of computational tools to facilitate routine bacterial genomics research. Their effectiveness is illustrated by examining the sequencing data of a clinical sample of Klebsiella pneumoniae. |
format | Online Article Text |
id | pubmed-10646344 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-10646344 2023-09-25 Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. de Almeida, Felipe Marques; de Campos, Tatiana Amabile; Pappas Jr, Georgios Joannis F1000Res Software Tool Article Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challenge due to the varying quality and quantity of data obtained from different sequencing instruments, as well as intrinsic characteristics of the genome and desired analyses. To address this issue, we have developed a set of container-based pipelines using Nextflow, offering both common workflows for inexperienced users and high levels of customization for experienced ones. Their processing strategies are adaptable based on the sequencing data type, and their modularity enables the incorporation of new components to address the community’s evolving needs. Methods: These pipelines consist of three parts: quality control, de novo genome assembly, and bacterial genome annotation. In particular, the genome annotation pipeline provides a comprehensive overview of the genome, including standard gene prediction and functional inference, as well as predictions relevant to clinical applications such as virulence and resistance gene annotation, secondary metabolite detection, prophage and plasmid prediction, and more. Results: The annotation results are presented in reports, genome browsers, and a web-based application that enables users to explore and interact with the genome annotation results. Conclusions: Overall, our user-friendly pipelines offer a seamless integration of computational tools to facilitate routine bacterial genomics research. Their effectiveness is illustrated by examining the sequencing data of a clinical sample of Klebsiella pneumoniae. F1000 Research Limited 2023-09-25 /pmc/articles/PMC10646344/ /pubmed/37970066 http://dx.doi.org/10.12688/f1000research.139488.1 Text en Copyright: © 2023 Almeida FMd et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article de Almeida, Felipe Marques de Campos, Tatiana Amabile Pappas Jr, Georgios Joannis Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. |
title | Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. |
title_sort | scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10646344/ https://www.ncbi.nlm.nih.gov/pubmed/37970066 http://dx.doi.org/10.12688/f1000research.139488.1 |