Conservation of Gene Cassettes among Diverse Viruses of the Human Gut

Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample...

Bibliographic Details
Main Authors: Minot, Samuel, Wu, Gary D., Lewis, James D., Bushman, Frederic D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3416800/
https://www.ncbi.nlm.nih.gov/pubmed/22900013
http://dx.doi.org/10.1371/journal.pone.0042342
_version_ 1782240448103841792
author Minot, Samuel
Wu, Gary D.
Lewis, James D.
Bushman, Frederic D.
collection PubMed
description Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample of 5.6 Gb of gut viral DNA sequence from six individuals. Tests showed that a new pipeline based on de Bruijn graph assembly yielded longer contigs that were able to recruit more reads than the equivalent non-optimized, single-pass approach. To characterize gene content, the database of viral RefSeq proteins was compared to the assembled viral contigs, generating a bipartite graph with functional cassettes linking together viral contigs, which revealed a high degree of connectivity between diverse genomes involving multiple genes of the same functional class. In a second step, open reading frames were grouped by their co-occurrence on contigs in a database-independent manner, revealing conserved cassettes of co-oriented ORFs. These methods reveal that free-living bacteriophages, while usually dissimilar at the nucleotide level, often have significant similarity at the level of encoded amino acid motifs, gene order, and gene orientation. These findings thus connect contemporary metagenomic analysis with classical studies of bacteriophage genomic cassettes. Software is available at https://sourceforge.net/projects/optitdba/.
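The bipartite-graph idea in the description above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the contig names, the protein-family labels, the `hits` list, and the `min_shared` threshold are all hypothetical and chosen only for illustration. The sketch links contigs to the protein families they match, then reports contig pairs that share several families, which is one simple way to surface a candidate cassette.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (contig, protein_family) matches; illustrative only,
# not data from the article.
hits = [
    ("contig_A", "terminase"),
    ("contig_A", "portal"),
    ("contig_A", "capsid"),
    ("contig_B", "terminase"),
    ("contig_B", "portal"),
    ("contig_C", "integrase"),
    ("contig_C", "capsid"),
]


def build_bipartite(hits):
    """Return both sides of the bipartite graph as adjacency maps."""
    contig_to_fams = defaultdict(set)
    fam_to_contigs = defaultdict(set)
    for contig, fam in hits:
        contig_to_fams[contig].add(fam)
        fam_to_contigs[fam].add(contig)
    return contig_to_fams, fam_to_contigs


def shared_families(contig_to_fams, min_shared=2):
    """Map each contig pair to the families it shares, keeping only
    pairs that share at least `min_shared` families (assumed cutoff)."""
    links = {}
    for a, b in combinations(sorted(contig_to_fams), 2):
        common = contig_to_fams[a] & contig_to_fams[b]
        if len(common) >= min_shared:
            links[(a, b)] = common
    return links


contig_to_fams, fam_to_contigs = build_bipartite(hits)
links = shared_families(contig_to_fams)
for pair, fams in links.items():
    print(pair, sorted(fams))
```

In this toy input only `contig_A` and `contig_B` share two families, so they are the only linked pair; a real analysis would also need gene order and orientation, which this sketch does not model.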
format Online
Article
Text
id pubmed-3416800
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3416800 2012-08-16 Conservation of Gene Cassettes among Diverse Viruses of the Human Gut Minot, Samuel Wu, Gary D. Lewis, James D. Bushman, Frederic D. PLoS One Research Article
Public Library of Science 2012-08-10 /pmc/articles/PMC3416800/ /pubmed/22900013 http://dx.doi.org/10.1371/journal.pone.0042342 Text en © 2012 Minot et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Conservation of Gene Cassettes among Diverse Viruses of the Human Gut
topic Research Article