A Machine Learning Bioinformatics Method to Predict Biological Activity from Biosynthetic Gene Clusters

Research in natural products, the genetically encoded small molecules produced by organisms in an idiosyncratic fashion, deals with molecular structure, biosynthesis, and biological activity. Bioinformatics analyses of microbial genomes can successfully reveal the genetic instructi...

Bibliographic Details
Main authors: Walker, Allison S., Clardy, Jon
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8243324/
https://www.ncbi.nlm.nih.gov/pubmed/34042443
http://dx.doi.org/10.1021/acs.jcim.0c01304
author Walker, Allison S.
Clardy, Jon
collection PubMed
description Research in natural products, the genetically encoded small molecules produced by organisms in an idiosyncratic fashion, deals with molecular structure, biosynthesis, and biological activity. Bioinformatics analyses of microbial genomes can successfully reveal the genetic instructions, biosynthetic gene clusters, that produce many natural products. Genes to molecule predictions made on biosynthetic gene clusters have revealed many important new structures. There is no comparable method for genes to biological activity predictions. To address this missing pathway, we developed a machine learning bioinformatics method for predicting a natural product’s antibiotic activity directly from the sequence of its biosynthetic gene cluster. We trained commonly used machine learning classifiers to predict antibacterial or antifungal activity based on features of known natural product biosynthetic gene clusters. We have identified classifiers that can attain accuracies as high as 80% and that have enabled the identification of biosynthetic enzymes and their corresponding molecular features that are associated with antibiotic activity.
format Online
Article
Text
id pubmed-8243324
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8243324 2021-07-06 A Machine Learning Bioinformatics Method to Predict Biological Activity from Biosynthetic Gene Clusters Walker, Allison S. Clardy, Jon J Chem Inf Model American Chemical Society 2021-05-27 2021-06-28 /pmc/articles/PMC8243324/ /pubmed/34042443 http://dx.doi.org/10.1021/acs.jcim.0c01304 Text en © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title A Machine Learning Bioinformatics Method to Predict Biological Activity from Biosynthetic Gene Clusters