Identifying antimicrobial peptides using word embedding with deep recurrent neural networks
MOTIVATION: Antibiotic resistance constitutes a major public health crisis, and finding new sources of antimicrobial drugs is crucial to solving it. Bacteriocins, which are bacterially produced antimicrobial peptide products, are candidates for broadening the available choices of antimicrobials. How...
Main authors: | Hamid, Md-Nafiz; Friedberg, Iddo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6581433/ https://www.ncbi.nlm.nih.gov/pubmed/30418485 http://dx.doi.org/10.1093/bioinformatics/bty937 |
_version_ | 1783428166226804736 |
---|---|
author | Hamid, Md-Nafiz Friedberg, Iddo |
collection | PubMed |
description | MOTIVATION: Antibiotic resistance constitutes a major public health crisis, and finding new sources of antimicrobial drugs is crucial to solving it. Bacteriocins, which are bacterially produced antimicrobial peptide products, are candidates for broadening the available choices of antimicrobials. However, the discovery of new bacteriocins by genomic mining is hampered by their sequences’ low complexity and high variance, which frustrates sequence similarity-based searches. RESULTS: Here we use word embeddings of protein sequences to represent bacteriocins, and apply a word embedding method that accounts for amino acid order in protein sequences, to predict novel bacteriocins from protein sequences without using sequence similarity. Our method predicts, with a high probability, six yet unknown putative bacteriocins in Lactobacillus. Generalized, the representation of sequences with word embeddings preserving sequence order information can be applied to peptide and protein classification problems for which sequence similarity cannot be used. AVAILABILITY AND IMPLEMENTATION: Data and source code for this project are freely available at: https://github.com/nafizh/NeuBI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6581433 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6581433 2019-06-21 Identifying antimicrobial peptides using word embedding with deep recurrent neural networks Hamid, Md-Nafiz Friedberg, Iddo Bioinformatics Original Papers Oxford University Press 2019-06 2018-11-10 /pmc/articles/PMC6581433/ /pubmed/30418485 http://dx.doi.org/10.1093/bioinformatics/bty937 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Identifying antimicrobial peptides using word embedding with deep recurrent neural networks |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6581433/ https://www.ncbi.nlm.nih.gov/pubmed/30418485 http://dx.doi.org/10.1093/bioinformatics/bty937 |
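
The abstract in this record describes representing protein sequences with word embeddings that preserve amino acid order. As a rough illustration of the input side of such a pipeline, the sketch below splits a sequence into overlapping k-mer "words" and maps them to integer ids in sequence order, which is the form an embedding layer followed by a recurrent network would typically consume. This is not taken from the authors' code: the choice of k = 3, the padding and unknown-token ids, the function names, and the example peptide string are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional

PAD_ID = 0  # assumed id for padding positions
UNK_ID = 1  # assumed id for k-mers not seen when the vocabulary was built


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers, left to right."""
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(sequences: Iterable[str], k: int = 3) -> Dict[str, int]:
    """Assign an integer id to every k-mer, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for token in kmer_tokens(seq, k):
            if token not in vocab:
                vocab[token] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(sequence: str, vocab: Dict[str, int], k: int = 3,
           max_len: Optional[int] = None) -> List[int]:
    """Map a sequence to k-mer ids, preserving order; pad or truncate to max_len."""
    ids = [vocab.get(token, UNK_ID) for token in kmer_tokens(sequence, k)]
    if max_len is not None:
        ids = ids[:max_len] + [PAD_ID] * (max_len - len(ids))
    return ids


# Hypothetical peptide fragment, used only to show the shapes involved.
vocab = build_vocab(["MKTAY"])
encoded = encode("MKTAY", vocab, max_len=5)
```

The resulting id lists could be fed to any embedding layer; because the ids stay in sequence order, a recurrent model downstream sees the k-mers in the order they occur in the protein rather than as an unordered bag.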