
Knowledge-guided data mining on the standardized architecture of NRPS: Subtypes, novel motifs, and sequence entanglements

Non-ribosomal peptide synthetase (NRPS) is a diverse family of biosynthetic enzymes for the assembly of bioactive peptides. Despite advances in microbial sequencing, the lack of a consistent standard for annotating NRPS domains and modules has made data-driven discoveries challenging. To address thi...


Bibliographic Details
Main authors: He, Ruolin, Zhang, Jinyu, Shao, Yuanzhe, Gu, Shaohua, Song, Chen, Qian, Long, Yin, Wen-Bing, Li, Zhiyuan
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10212144/
https://www.ncbi.nlm.nih.gov/pubmed/37186644
http://dx.doi.org/10.1371/journal.pcbi.1011100
author He, Ruolin
Zhang, Jinyu
Shao, Yuanzhe
Gu, Shaohua
Song, Chen
Qian, Long
Yin, Wen-Bing
Li, Zhiyuan
collection PubMed
description Non-ribosomal peptide synthetase (NRPS) is a diverse family of biosynthetic enzymes for the assembly of bioactive peptides. Despite advances in microbial sequencing, the lack of a consistent standard for annotating NRPS domains and modules has made data-driven discoveries challenging. To address this, we introduced a standardized architecture for NRPS, by using known conserved motifs to partition typical domains. This motif-and-intermotif standardization allowed for systematic evaluations of sequence properties from a large number of NRPS pathways, resulting in the most comprehensive cross-kingdom C domain subtype classifications to date, as well as the discovery and experimental validation of novel conserved motifs with functional significance. Furthermore, our coevolution analysis revealed important barriers associated with re-engineering NRPSs and uncovered the entanglement between phylogeny and substrate specificity in NRPS sequences. Our findings provide a comprehensive and statistically insightful analysis of NRPS sequences, opening avenues for future data-driven discoveries.
format Online
Article
Text
id pubmed-10212144
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10212144 2023-05-26 Knowledge-guided data mining on the standardized architecture of NRPS: Subtypes, novel motifs, and sequence entanglements He, Ruolin Zhang, Jinyu Shao, Yuanzhe Gu, Shaohua Song, Chen Qian, Long Yin, Wen-Bing Li, Zhiyuan PLoS Comput Biol Research Article Non-ribosomal peptide synthetase (NRPS) is a diverse family of biosynthetic enzymes for the assembly of bioactive peptides. Despite advances in microbial sequencing, the lack of a consistent standard for annotating NRPS domains and modules has made data-driven discoveries challenging. To address this, we introduced a standardized architecture for NRPS, by using known conserved motifs to partition typical domains. This motif-and-intermotif standardization allowed for systematic evaluations of sequence properties from a large number of NRPS pathways, resulting in the most comprehensive cross-kingdom C domain subtype classifications to date, as well as the discovery and experimental validation of novel conserved motifs with functional significance. Furthermore, our coevolution analysis revealed important barriers associated with re-engineering NRPSs and uncovered the entanglement between phylogeny and substrate specificity in NRPS sequences. Our findings provide a comprehensive and statistically insightful analysis of NRPS sequences, opening avenues for future data-driven discoveries. Public Library of Science 2023-05-15 /pmc/articles/PMC10212144/ /pubmed/37186644 http://dx.doi.org/10.1371/journal.pcbi.1011100 Text en © 2023 He et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Knowledge-guided data mining on the standardized architecture of NRPS: Subtypes, novel motifs, and sequence entanglements
topic Research Article