DeePVP: Identification and classification of phage virion proteins using deep learning
BACKGROUND: Many biological properties of phages are determined by phage virion proteins (PVPs), and the poor annotation of PVPs is a bottleneck for many areas of viral research, such as viral phylogenetic analysis, viral host identification, and antibacterial drug design. Because of the high divers...
Main authors: | Fang, Zhencheng; Feng, Tao; Zhou, Hongwei; Chen, Muxuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9366990/ https://www.ncbi.nlm.nih.gov/pubmed/35950840 http://dx.doi.org/10.1093/gigascience/giac076 |
_version_ | 1784765690074365952 |
---|---|
author | Fang, Zhencheng Feng, Tao Zhou, Hongwei Chen, Muxuan |
author_facet | Fang, Zhencheng Feng, Tao Zhou, Hongwei Chen, Muxuan |
author_sort | Fang, Zhencheng |
collection | PubMed |
description | BACKGROUND: Many biological properties of phages are determined by phage virion proteins (PVPs), and the poor annotation of PVPs is a bottleneck for many areas of viral research, such as viral phylogenetic analysis, viral host identification, and antibacterial drug design. Because of the high diversity of PVP sequences, the PVP annotation of a phage genome remains a particularly challenging bioinformatic task. FINDINGS: Based on deep learning, we developed DeePVP. The main module of DeePVP aims to discriminate PVPs from non-PVPs within a phage genome, while the extended module of DeePVP can further classify predicted PVPs into the 10 major classes of PVPs. Compared with the present state-of-the-art tools, the main module of DeePVP performs better, with a 9.05% higher F1-score in the PVP identification task. Moreover, the overall accuracy of the extended module of DeePVP in the PVP classification task is approximately 3.72% higher than that of PhANNs. Two application cases show that the predictions of DeePVP are more reliable and can better reveal the compact PVP-enriched region than the current state-of-the-art tools. Particularly, in the Escherichia phage phiEC1 genome, a novel PVP-enriched region that is conserved in many other Escherichia phage genomes was identified, indicating that DeePVP will be a useful tool for the analysis of phage genomic structures. CONCLUSIONS: DeePVP outperforms state-of-the-art tools. The program is optimized in both a virtual machine with graphical user interface and a docker so that the tool can be easily run by noncomputer professionals. DeePVP is freely available at https://github.com/fangzcbio/DeePVP/. |
format | Online Article Text |
id | pubmed-9366990 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9366990 2022-08-12 DeePVP: Identification and classification of phage virion proteins using deep learning Fang, Zhencheng Feng, Tao Zhou, Hongwei Chen, Muxuan Gigascience Technical Note BACKGROUND: Many biological properties of phages are determined by phage virion proteins (PVPs), and the poor annotation of PVPs is a bottleneck for many areas of viral research, such as viral phylogenetic analysis, viral host identification, and antibacterial drug design. Because of the high diversity of PVP sequences, the PVP annotation of a phage genome remains a particularly challenging bioinformatic task. FINDINGS: Based on deep learning, we developed DeePVP. The main module of DeePVP aims to discriminate PVPs from non-PVPs within a phage genome, while the extended module of DeePVP can further classify predicted PVPs into the 10 major classes of PVPs. Compared with the present state-of-the-art tools, the main module of DeePVP performs better, with a 9.05% higher F1-score in the PVP identification task. Moreover, the overall accuracy of the extended module of DeePVP in the PVP classification task is approximately 3.72% higher than that of PhANNs. Two application cases show that the predictions of DeePVP are more reliable and can better reveal the compact PVP-enriched region than the current state-of-the-art tools. Particularly, in the Escherichia phage phiEC1 genome, a novel PVP-enriched region that is conserved in many other Escherichia phage genomes was identified, indicating that DeePVP will be a useful tool for the analysis of phage genomic structures. CONCLUSIONS: DeePVP outperforms state-of-the-art tools. The program is optimized in both a virtual machine with graphical user interface and a docker so that the tool can be easily run by noncomputer professionals. DeePVP is freely available at https://github.com/fangzcbio/DeePVP/. 
Oxford University Press 2022-08-11 /pmc/articles/PMC9366990/ /pubmed/35950840 http://dx.doi.org/10.1093/gigascience/giac076 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Fang, Zhencheng Feng, Tao Zhou, Hongwei Chen, Muxuan DeePVP: Identification and classification of phage virion proteins using deep learning |
title | DeePVP: Identification and classification of phage virion proteins using deep learning |
title_full | DeePVP: Identification and classification of phage virion proteins using deep learning |
title_fullStr | DeePVP: Identification and classification of phage virion proteins using deep learning |
title_full_unstemmed | DeePVP: Identification and classification of phage virion proteins using deep learning |
title_short | DeePVP: Identification and classification of phage virion proteins using deep learning |
title_sort | deepvp: identification and classification of phage virion proteins using deep learning |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9366990/ https://www.ncbi.nlm.nih.gov/pubmed/35950840 http://dx.doi.org/10.1093/gigascience/giac076 |
work_keys_str_mv | AT fangzhencheng deepvpidentificationandclassificationofphagevirionproteinsusingdeeplearning AT fengtao deepvpidentificationandclassificationofphagevirionproteinsusingdeeplearning AT zhouhongwei deepvpidentificationandclassificationofphagevirionproteinsusingdeeplearning AT chenmuxuan deepvpidentificationandclassificationofphagevirionproteinsusingdeeplearning |