NeuRiPP: Neural network identification of RiPP precursor peptides
Significant progress has been made in the past few years on the computational identification of biosynthetic gene clusters (BGCs) that encode ribosomally synthesized and post-translationally modified peptides (RiPPs). This is done by identifying both RiPP tailoring enzymes (RTEs) and RiPP precursor...
Main author: | de los Santos, Emmanuel L. C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6746993/ https://www.ncbi.nlm.nih.gov/pubmed/31527713 http://dx.doi.org/10.1038/s41598-019-49764-z |
_version_ | 1783451798975021056 |
---|---|
author | de los Santos, Emmanuel L. C. |
author_facet | de los Santos, Emmanuel L. C. |
author_sort | de los Santos, Emmanuel L. C. |
collection | PubMed |
description | Significant progress has been made in the past few years on the computational identification of biosynthetic gene clusters (BGCs) that encode ribosomally synthesized and post-translationally modified peptides (RiPPs). This is done by identifying both RiPP tailoring enzymes (RTEs) and RiPP precursor peptides (PPs). However, identification of PPs, particularly for novel RiPP classes, remains challenging. To address this, machine learning has been used to accurately identify PP sequences. Current machine learning tools have limitations, since they are specific to the RiPP class they are trained for and are context-dependent, requiring information about the surrounding genetic environment of the putative PP sequences. NeuRiPP overcomes these limitations. It does this by leveraging the rich data set of high-confidence putative PP sequences from existing programs, along with experimentally verified PPs from RiPP databases. NeuRiPP uses neural network architectures that are suitable for peptide classification with weights trained on PP datasets. It is able to identify known PP sequences, and sequences that are likely PPs. When tested on existing RiPP BGC datasets, NeuRiPP was able to identify PP sequences in significantly more putative RiPP clusters than current tools while maintaining the same HMM hit accuracy. Finally, NeuRiPP was able to successfully identify PP sequences from novel RiPP classes that were recently characterized experimentally, highlighting its utility in complementing existing bioinformatics tools. |
format | Online Article Text |
id | pubmed-6746993 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6746993 2019-09-27 NeuRiPP: Neural network identification of RiPP precursor peptides de los Santos, Emmanuel L. C. Sci Rep Article Significant progress has been made in the past few years on the computational identification of biosynthetic gene clusters (BGCs) that encode ribosomally synthesized and post-translationally modified peptides (RiPPs). This is done by identifying both RiPP tailoring enzymes (RTEs) and RiPP precursor peptides (PPs). However, identification of PPs, particularly for novel RiPP classes, remains challenging. To address this, machine learning has been used to accurately identify PP sequences. Current machine learning tools have limitations, since they are specific to the RiPP class they are trained for and are context-dependent, requiring information about the surrounding genetic environment of the putative PP sequences. NeuRiPP overcomes these limitations. It does this by leveraging the rich data set of high-confidence putative PP sequences from existing programs, along with experimentally verified PPs from RiPP databases. NeuRiPP uses neural network architectures that are suitable for peptide classification with weights trained on PP datasets. It is able to identify known PP sequences, and sequences that are likely PPs. When tested on existing RiPP BGC datasets, NeuRiPP was able to identify PP sequences in significantly more putative RiPP clusters than current tools while maintaining the same HMM hit accuracy. Finally, NeuRiPP was able to successfully identify PP sequences from novel RiPP classes that were recently characterized experimentally, highlighting its utility in complementing existing bioinformatics tools.
Nature Publishing Group UK 2019-09-16 /pmc/articles/PMC6746993/ /pubmed/31527713 http://dx.doi.org/10.1038/s41598-019-49764-z Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article de los Santos, Emmanuel L. C. NeuRiPP: Neural network identification of RiPP precursor peptides |
title | NeuRiPP: Neural network identification of RiPP precursor peptides |
title_full | NeuRiPP: Neural network identification of RiPP precursor peptides |
title_fullStr | NeuRiPP: Neural network identification of RiPP precursor peptides |
title_full_unstemmed | NeuRiPP: Neural network identification of RiPP precursor peptides |
title_short | NeuRiPP: Neural network identification of RiPP precursor peptides |
title_sort | neuripp: neural network identification of ripp precursor peptides |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6746993/ https://www.ncbi.nlm.nih.gov/pubmed/31527713 http://dx.doi.org/10.1038/s41598-019-49764-z |
work_keys_str_mv | AT delossantosemmanuellc neurippneuralnetworkidentificationofrippprecursorpeptides |