
VIROME: a standard operating procedure for analysis of viral metagenome sequences

One consistent finding among studies using shotgun metagenomics to analyze whole viral communities is that most viral sequences show no significant homology to known sequences. Thus, bioinformatic analyses based on sequence collections such as GenBank nr, which are composed largely of sequences fro...


Bibliographic Details
Main authors: Wommack, K. Eric; Bhavsar, Jaysheel; Polson, Shawn W.; Chen, Jing; Dumas, Michael; Srinivasiah, Sharath; Furman, Megan; Jamindar, Sanchita; Nasko, Daniel J.
Format: Online Article Text
Language: English
Published: Michigan State University 2012
Subjects: Standard Operating Procedures
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3558967/
https://www.ncbi.nlm.nih.gov/pubmed/23407591
http://dx.doi.org/10.4056/sigs.2945050
_version_ 1782257492017807360
author Wommack, K. Eric
Bhavsar, Jaysheel
Polson, Shawn W.
Chen, Jing
Dumas, Michael
Srinivasiah, Sharath
Furman, Megan
Jamindar, Sanchita
Nasko, Daniel J.
author_sort Wommack, K. Eric
collection PubMed
description One consistent finding among studies using shotgun metagenomics to analyze whole viral communities is that most viral sequences show no significant homology to known sequences. Thus, bioinformatic analyses based on sequence collections such as GenBank nr, which are composed largely of sequences from known organisms, tend to ignore a majority of sequences within most shotgun viral metagenome libraries. Here we describe a bioinformatic pipeline, the Viral Informatics Resource for Metagenome Exploration (VIROME), that emphasizes the classification of viral metagenome sequences (predicted open-reading frames) based on homology search results against both known and environmental sequences. Functional and taxonomic information is derived from five annotated sequence databases that are linked to the UniRef 100 database. Environmental classifications are obtained from hits against a custom database, MetaGenomes On-Line, which contains 49 million predicted environmental peptides. Each predicted viral metagenomic ORF run through the VIROME pipeline is placed into one of seven ORF classes; thus, every sequence receives a meaningful annotation. Additionally, the pipeline includes quality control measures to remove contaminating and poor-quality sequences and assesses the potential amount of cellular DNA contamination in a viral metagenome library by screening for rRNA genes. Access to the VIROME pipeline and analysis results is provided through a web-application interface that is dynamically linked to a relational back-end database. The VIROME web-application interface is designed to allow users flexibility in retrieving sequences (reads, ORFs, predicted peptides) and search results for focused secondary analyses.
format Online
Article
Text
id pubmed-3558967
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Michigan State University
record_format MEDLINE/PubMed
spelling pubmed-3558967 2013-02-13 VIROME: a standard operating procedure for analysis of viral metagenome sequences Wommack, K. Eric Bhavsar, Jaysheel Polson, Shawn W. Chen, Jing Dumas, Michael Srinivasiah, Sharath Furman, Megan Jamindar, Sanchita Nasko, Daniel J. Stand Genomic Sci Standard Operating Procedures Michigan State University 2012-07-27 /pmc/articles/PMC3558967/ /pubmed/23407591 http://dx.doi.org/10.4056/sigs.2945050 Text en Copyright © retained by original authors. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Standard Operating Procedures
Wommack, K. Eric
Bhavsar, Jaysheel
Polson, Shawn W.
Chen, Jing
Dumas, Michael
Srinivasiah, Sharath
Furman, Megan
Jamindar, Sanchita
Nasko, Daniel J.
VIROME: a standard operating procedure for analysis of viral metagenome sequences
title VIROME: a standard operating procedure for analysis of viral metagenome sequences
topic Standard Operating Procedures
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3558967/
https://www.ncbi.nlm.nih.gov/pubmed/23407591
http://dx.doi.org/10.4056/sigs.2945050