DeePVP: Identification and classification of phage virion proteins using deep learning

Bibliographic Details
Main authors: Fang, Zhencheng; Feng, Tao; Zhou, Hongwei; Chen, Muxuan
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9366990/
https://www.ncbi.nlm.nih.gov/pubmed/35950840
http://dx.doi.org/10.1093/gigascience/giac076
author Fang, Zhencheng
Feng, Tao
Zhou, Hongwei
Chen, Muxuan
collection PubMed
description BACKGROUND: Many biological properties of phages are determined by phage virion proteins (PVPs), and the poor annotation of PVPs is a bottleneck for many areas of viral research, such as viral phylogenetic analysis, viral host identification, and antibacterial drug design. Because of the high diversity of PVP sequences, the PVP annotation of a phage genome remains a particularly challenging bioinformatic task. FINDINGS: Based on deep learning, we developed DeePVP. The main module of DeePVP aims to discriminate PVPs from non-PVPs within a phage genome, while the extended module of DeePVP can further classify predicted PVPs into the 10 major classes of PVPs. Compared with current state-of-the-art tools, the main module of DeePVP performs better, with a 9.05% higher F1-score in the PVP identification task. Moreover, the overall accuracy of the extended module of DeePVP in the PVP classification task is approximately 3.72% higher than that of PhANNs. Two application cases show that the predictions of DeePVP are more reliable than those of current state-of-the-art tools and better reveal compact PVP-enriched regions. In particular, in the Escherichia phage phiEC1 genome, a novel PVP-enriched region that is conserved in many other Escherichia phage genomes was identified, indicating that DeePVP will be a useful tool for the analysis of phage genomic structures. CONCLUSIONS: DeePVP outperforms state-of-the-art tools. The program is packaged both as a virtual machine with a graphical user interface and as a Docker container so that the tool can be easily run by users without computing expertise. DeePVP is freely available at https://github.com/fangzcbio/DeePVP/.
format Online
Article
Text
id pubmed-9366990
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9366990 2022-08-12 DeePVP: Identification and classification of phage virion proteins using deep learning Fang, Zhencheng Feng, Tao Zhou, Hongwei Chen, Muxuan Gigascience Technical Note
Oxford University Press 2022-08-11 /pmc/articles/PMC9366990/ /pubmed/35950840 http://dx.doi.org/10.1093/gigascience/giac076 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title DeePVP: Identification and classification of phage virion proteins using deep learning
topic Technical Note