Conservation of Gene Cassettes among Diverse Viruses of the Human Gut
Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample...
Main Authors: | Minot, Samuel; Wu, Gary D.; Lewis, James D.; Bushman, Frederic D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3416800/ https://www.ncbi.nlm.nih.gov/pubmed/22900013 http://dx.doi.org/10.1371/journal.pone.0042342 |
_version_ | 1782240448103841792 |
---|---|
author | Minot, Samuel Wu, Gary D. Lewis, James D. Bushman, Frederic D. |
author_facet | Minot, Samuel Wu, Gary D. Lewis, James D. Bushman, Frederic D. |
author_sort | Minot, Samuel |
collection | PubMed |
description | Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample of 5.6 Gb of gut viral DNA sequence from six individuals. Tests showed that a new pipeline based on DeBruijn graph assembly yielded longer contigs that were able to recruit more reads than the equivalent non-optimized, single-pass approach. To characterize gene content, the database of viral RefSeq proteins was compared to the assembled viral contigs, generating a bipartite graph with functional cassettes linking together viral contigs, which revealed a high degree of connectivity between diverse genomes involving multiple genes of the same functional class. In a second step, open reading frames were grouped by their co-occurrence on contigs in a database-independent manner, revealing conserved cassettes of co-oriented ORFs. These methods reveal that free-living bacteriophages, while usually dissimilar at the nucleotide level, often have significant similarity at the level of encoded amino acid motifs, gene order, and gene orientation. These findings thus connect contemporary metagenomic analysis with classical studies of bacteriophage genomic cassettes. Software is available at https://sourceforge.net/projects/optitdba/. |
format | Online Article Text |
id | pubmed-3416800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3416800 2012-08-16 Conservation of Gene Cassettes among Diverse Viruses of the Human Gut Minot, Samuel Wu, Gary D. Lewis, James D. Bushman, Frederic D. PLoS One Research Article Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample of 5.6 Gb of gut viral DNA sequence from six individuals. Tests showed that a new pipeline based on DeBruijn graph assembly yielded longer contigs that were able to recruit more reads than the equivalent non-optimized, single-pass approach. To characterize gene content, the database of viral RefSeq proteins was compared to the assembled viral contigs, generating a bipartite graph with functional cassettes linking together viral contigs, which revealed a high degree of connectivity between diverse genomes involving multiple genes of the same functional class. In a second step, open reading frames were grouped by their co-occurrence on contigs in a database-independent manner, revealing conserved cassettes of co-oriented ORFs. These methods reveal that free-living bacteriophages, while usually dissimilar at the nucleotide level, often have significant similarity at the level of encoded amino acid motifs, gene order, and gene orientation. These findings thus connect contemporary metagenomic analysis with classical studies of bacteriophage genomic cassettes. Software is available at https://sourceforge.net/projects/optitdba/.
Public Library of Science 2012-08-10 /pmc/articles/PMC3416800/ /pubmed/22900013 http://dx.doi.org/10.1371/journal.pone.0042342 Text en © 2012 Minot et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Minot, Samuel Wu, Gary D. Lewis, James D. Bushman, Frederic D. Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title | Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title_full | Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title_fullStr | Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title_full_unstemmed | Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title_short | Conservation of Gene Cassettes among Diverse Viruses of the Human Gut |
title_sort | conservation of gene cassettes among diverse viruses of the human gut |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3416800/ https://www.ncbi.nlm.nih.gov/pubmed/22900013 http://dx.doi.org/10.1371/journal.pone.0042342 |
work_keys_str_mv | AT minotsamuel conservationofgenecassettesamongdiversevirusesofthehumangut AT wugaryd conservationofgenecassettesamongdiversevirusesofthehumangut AT lewisjamesd conservationofgenecassettesamongdiversevirusesofthehumangut AT bushmanfredericd conservationofgenecassettesamongdiversevirusesofthehumangut |
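
The description field above mentions a bipartite graph in which reference proteins link assembled viral contigs. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes the protein-versus-contig search has already been reduced to `(contig, protein)` hit pairs, and the function names and toy identifiers (`contig_1`, `terminase`, and so on) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_bipartite(hits):
    """Map each reference protein to the set of contigs it matched.

    `hits` is an iterable of (contig_id, protein_id) pairs; how the pairs
    are produced (search tool, score cutoff) is outside this sketch.
    """
    contigs_by_protein = defaultdict(set)
    for contig, protein in hits:
        contigs_by_protein[protein].add(contig)
    return contigs_by_protein


def shared_protein_links(contigs_by_protein):
    """Project the bipartite graph onto contigs.

    Returns {(contig_a, contig_b): number of reference proteins hit by both}.
    """
    links = defaultdict(int)
    for contigs in contigs_by_protein.values():
        for a, b in combinations(sorted(contigs), 2):
            links[(a, b)] += 1
    return dict(links)


# Hypothetical toy data: two contigs share two proteins, a third shares none.
hits = [
    ("contig_1", "terminase"),
    ("contig_2", "terminase"),
    ("contig_1", "portal"),
    ("contig_2", "portal"),
    ("contig_3", "integrase"),
]
links = shared_protein_links(build_bipartite(hits))
print(links)
```

Contig pairs joined by several proteins of the same functional class are the kind of connection the abstract describes; a real analysis would also need to record gene order and orientation, which this sketch omits.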