BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs

Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools...

Bibliographic Details
Main authors: Meleshko, Dmitry; Mohimani, Hosein; Tracanna, Vittorio; Hajirasouliha, Iman; Medema, Marnix H.; Korobeynikov, Anton; Pevzner, Pavel A.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6673720/
https://www.ncbi.nlm.nih.gov/pubmed/31160374
http://dx.doi.org/10.1101/gr.243477.118
author Meleshko, Dmitry
Mohimani, Hosein
Tracanna, Vittorio
Hajirasouliha, Iman
Medema, Marnix H.
Korobeynikov, Anton
Pevzner, Pavel A.
author_sort Meleshko, Dmitry
collection PubMed
description Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools often assume that each BGC is encoded within a single contig in the genome assembly, a condition that is violated for most sequenced microbial genomes where BGCs are often scattered through several contigs, making it difficult to reconstruct them. The situation is even more severe in shotgun metagenomics, where the contigs are often short, and the existing tools fail to predict a large fraction of long BGCs. While it is difficult to assemble BGCs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding long BGCs. We describe biosyntheticSPAdes, a tool for predicting BGCs in assembly graphs and demonstrate that it greatly improves the reconstruction of BGCs from genomic and metagenomics data sets.
format Online
Article
Text
id pubmed-6673720
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6673720
2020-02-01
BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs
Meleshko, Dmitry
Mohimani, Hosein
Tracanna, Vittorio
Hajirasouliha, Iman
Medema, Marnix H.
Korobeynikov, Anton
Pevzner, Pavel A.
Genome Res
Method
Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools often assume that each BGC is encoded within a single contig in the genome assembly, a condition that is violated for most sequenced microbial genomes where BGCs are often scattered through several contigs, making it difficult to reconstruct them. The situation is even more severe in shotgun metagenomics, where the contigs are often short, and the existing tools fail to predict a large fraction of long BGCs. While it is difficult to assemble BGCs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding long BGCs. We describe biosyntheticSPAdes, a tool for predicting BGCs in assembly graphs and demonstrate that it greatly improves the reconstruction of BGCs from genomic and metagenomics data sets.
Cold Spring Harbor Laboratory Press
2019-08
/pmc/articles/PMC6673720/
/pubmed/31160374
http://dx.doi.org/10.1101/gr.243477.118
Text
en
© 2019 Meleshko et al.; Published by Cold Spring Harbor Laboratory Press
http://creativecommons.org/licenses/by-nc/4.0/
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6673720/
https://www.ncbi.nlm.nih.gov/pubmed/31160374
http://dx.doi.org/10.1101/gr.243477.118