Navigating and expanding the roadmap of natural product genome mining tools

Bibliographic Details
Main authors: Biermann, Friederike; Wenski, Sebastian L; Helfrich, Eric J N
Format: Online Article Text
Language: English
Published: Beilstein-Institut 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9749553/
https://www.ncbi.nlm.nih.gov/pubmed/36570563
http://dx.doi.org/10.3762/bjoc.18.178
Description
Summary: Natural products are structurally highly diverse and exhibit a wide array of biological activities. As a result, they serve as an important source of new drug leads. Traditionally, natural products have been discovered by bioactivity-guided fractionation. The advent of genome sequencing technology has resulted in the introduction of an alternative approach towards novel natural product scaffolds: genome mining. Genome mining is an in silico natural product discovery strategy in which sequenced genomes are analyzed for the potential of the associated organism to produce natural products. For most natural product classes, seemingly universal biosynthetic principles have been deciphered; these principles are used to detect natural product biosynthetic gene clusters using pathway-encoded conserved key enzymes, domains, or motifs as bait. Several generations of highly sophisticated tools have been developed for the biosynthetic rule-based identification of natural product gene clusters. Apart from these hard-coded algorithms, multiple tools that use machine learning-based approaches have been designed to complement the existing genome mining tool set and focus on natural product gene clusters that lack genes with conserved signature sequences. In this perspective, we take a closer look at state-of-the-art genome mining tools that are based on either hard-coded rules or machine learning algorithms, with an emphasis on the confidence of their predictions and their potential to identify non-canonical natural product biosynthetic gene clusters. We highlight the genome mining pipelines' current strengths and limitations by contrasting their advantages and disadvantages. Moreover, we introduce two indirect biosynthetic gene cluster identification strategies that complement current workflows. The combination of all genome mining approaches will pave the way towards a more comprehensive understanding of the full biosynthetic repertoire encoded in microbial genome sequences.