Automated classification of tailed bacteriophages according to their neck organization

BACKGROUND: The genetic diversity observed among bacteriophages remains a major obstacle for the identification of homologs and the comparison of their functional modules. In the structural module, although several classes of homologous proteins contributing to the head and tail structure can be det...

Bibliographic Details
Main authors: Lopes, Anne; Tavares, Paulo; Petit, Marie-Agnès; Guérois, Raphaël; Zinn-Justin, Sophie
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4362835/
https://www.ncbi.nlm.nih.gov/pubmed/25428721
http://dx.doi.org/10.1186/1471-2164-15-1027
author Lopes, Anne
Tavares, Paulo
Petit, Marie-Agnès
Guérois, Raphaël
Zinn-Justin, Sophie
collection PubMed
description BACKGROUND: The genetic diversity observed among bacteriophages remains a major obstacle for the identification of homologs and the comparison of their functional modules. In the structural module, although several classes of homologous proteins contributing to the head and tail structure can be detected, proteins of the head-to-tail connection (or neck) are generally more divergent. Yet, molecular analyses of a few tailed phages belonging to different morphological classes suggested that only a limited number of structural solutions are used in order to produce a functional virion. To challenge this hypothesis and analyze protein diversity at the virion neck, we developed a specific computational strategy to cope with sequence divergence in phage proteins. We searched for homologs of a set of proteins encoded in the structural module using a phage learning database. RESULTS: We show that using a combination of iterative profile-profile comparison and gene context analyses, we can identify a set of head, neck and tail proteins in most tailed bacteriophages of our database. Classification of phages based on neck protein sequences delineates 4 Types corresponding to known morphological subfamilies. Further analysis of the most abundant Type 1 yields 10 Clusters characterized by consistent sets of head, neck and tail proteins. We developed Virfam, a webserver that automatically identifies proteins of the phage head-neck-tail module and assigns phages to the most closely related cluster of phages. This server was tested against 624 new phages from the NCBI database. 93% of the tailed and unclassified phages could be assigned to our head-neck-tail based categories, thus highlighting the large representativeness of the identified virion architectures. Types and Clusters delineate consistent subgroups of Caudovirales, which correlate with several virion properties.
CONCLUSIONS: Our method and webserver have the capacity to automatically classify most tailed phages, detect their structural module, assign a function to a set of their head, neck and tail genes, provide their morphologic subtype and localize these phages within a “head-neck-tail” based classification. It should enable analysis of large sets of phage genomes. In particular, it should contribute to the classification of the abundant unknown viruses found on assembled contigs of metagenomic samples. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-1027) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4362835
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4362835 2015-03-18 Automated classification of tailed bacteriophages according to their neck organization Lopes, Anne Tavares, Paulo Petit, Marie-Agnès Guérois, Raphaël Zinn-Justin, Sophie BMC Genomics Software BioMed Central 2014-11-27 /pmc/articles/PMC4362835/ /pubmed/25428721 http://dx.doi.org/10.1186/1471-2164-15-1027 Text en © Lopes et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Automated classification of tailed bacteriophages according to their neck organization
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4362835/
https://www.ncbi.nlm.nih.gov/pubmed/25428721
http://dx.doi.org/10.1186/1471-2164-15-1027