Tetranucleotide usage highlights genomic heterogeneity among mycobacteriophages


Bibliographic Details
Main Authors: Siranosian, Benjamin; Perera, Sudheesha; Williams, Edward; Ye, Chen; de Graffenried, Christopher; Shank, Peter
Format: Online Article Text
Language: English
Published: F1000Research 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4841201/
https://www.ncbi.nlm.nih.gov/pubmed/27134721
http://dx.doi.org/10.12688/f1000research.6077.2
author Siranosian, Benjamin
Perera, Sudheesha
Williams, Edward
Ye, Chen
de Graffenried, Christopher
Shank, Peter
collection PubMed
description Background: The genomic sequences of mycobacteriophages, phages infecting mycobacterial hosts, are diverse and mosaic. Mycobacteriophages often share little nucleotide similarity, but most of them have been grouped into lettered clusters and further into subclusters. Traditionally, mycobacteriophage genomes are analyzed based on sequence alignment or knowledge of gene content. However, these approaches are computationally expensive and can be ineffective for significantly diverged sequences. As an alternative to alignment-based genome analysis, we evaluated tetranucleotide usage in mycobacteriophage genomes. These methods make it easier to characterize features of the mycobacteriophage population at many scales.
Description: We computed tetranucleotide usage deviation (TUD), the ratio of observed counts of 4-mers in a genome to the expected count under a null model. TUD values are comparable between members of a phage subcluster and distinct between subclusters. With few exceptions, neighbor-joining phylogenetic trees and hierarchical clustering dendrograms constructed using TUD values place phages in a monophyletic clade with members of the same subcluster. Regions in a genome with exceptional TUD values can point to interesting features of genomic architecture. Finally, we found that subcluster B3 mycobacteriophages contain significantly overrepresented 4-mers and 6-mers that are atypical of phage genomes.
Conclusions: Statistics based on tetranucleotide usage support established clustering of mycobacteriophages and can uncover interesting relationships within and between sequenced phage genomes. These methods are efficient to compute and do not require sequence alignment or knowledge of gene content. The code to download mycobacteriophage genome sequences and reproduce our analysis is freely available at https://github.com/bsiranosian/tango_final.
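The TUD ratio described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (that is in the repository linked in the abstract). The abstract does not say which null model is used, so this sketch assumes a zero-order model in which the expected count of a 4-mer is the product of its single-base frequencies times the number of counted windows. The function names `tud` and `tud_distance` are hypothetical, and Euclidean distance is only one plausible way to compare two TUD profiles before building a tree or dendrogram.

```python
from collections import Counter
from itertools import product
from math import sqrt
from typing import Dict

ALPHABET = "ACGT"


def tud(sequence: str, k: int = 4) -> Dict[str, float]:
    """Observed/expected count ratio for every k-mer.

    ASSUMPTION: the null model is zero-order (independent bases), so the
    expected count is prod(base frequencies) * number of counted windows.
    """
    seq = sequence.upper()

    # Single-base frequencies, ignoring ambiguous symbols such as N.
    base_counts = Counter(b for b in seq if b in ALPHABET)
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    freq = {b: base_counts[b] / total for b in ALPHABET}

    # Observed counts over overlapping windows; windows containing an
    # ambiguous base are skipped.
    observed: Counter = Counter()
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(b in ALPHABET for b in kmer):
            observed[kmer] += 1
            windows += 1

    ratios: Dict[str, float] = {}
    for letters in product(ALPHABET, repeat=k):
        kmer = "".join(letters)
        p = 1.0
        for b in kmer:
            p *= freq[b]
        expected = p * windows
        # A k-mer that cannot occur under the null model gets ratio 0.
        ratios[kmer] = observed[kmer] / expected if expected > 0 else 0.0
    return ratios


def tud_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Euclidean distance between two TUD profiles (an assumed metric)."""
    return sqrt(sum((a[kmer] - b[kmer]) ** 2 for kmer in a))


if __name__ == "__main__":
    toy_a = "ACGT" * 25          # toy sequences, not real phage genomes
    toy_b = "AACCGGTT" * 12
    profile_a, profile_b = tud(toy_a), tud(toy_b)
    print(len(profile_a), "4-mers scored")
    print("distance between toy profiles:", tud_distance(profile_a, profile_b))
```

A ratio above 1 marks a 4-mer that occurs more often than the assumed null model predicts, and a ratio below 1 marks one that is underrepresented. A matrix of pairwise `tud_distance` values across genomes is the kind of input a neighbor-joining or hierarchical clustering routine would take.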
format Online
Article
Text
id pubmed-4841201
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-4841201 2016-04-29 F1000Res Research Article
F1000Research 2015-10-30 /pmc/articles/PMC4841201/ /pubmed/27134721 http://dx.doi.org/10.12688/f1000research.6077.2 Text en
Copyright: © 2015 Siranosian B et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Tetranucleotide usage highlights genomic heterogeneity among mycobacteriophages
topic Research Article