From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs
Background: Fast and computationally efficient strategies are required to explore genomic relationships within an increasingly large and diverse phage sequence space. Here, we present PhageClouds, a novel approach using a graph database of phage genomic sequences and their intergenomic distances to...
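Main authors: | Rangel-Pineros, Guillermo; Millard, Andrew; Michniewski, Slawomir; Scanlan, David; Sirén, Kimmo; Reyes, Alejandro; Petersen, Bent; Clokie, Martha R.J.; Sicheritz-Pontén, Thomas |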
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Mary Ann Liebert, Inc., publishers
2021
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9041511/ https://www.ncbi.nlm.nih.gov/pubmed/36147515 http://dx.doi.org/10.1089/phage.2021.0008 |
_version_ | 1784694544019750912 |
---|---|
author | Rangel-Pineros, Guillermo Millard, Andrew Michniewski, Slawomir Scanlan, David Sirén, Kimmo Reyes, Alejandro Petersen, Bent Clokie, Martha R.J. Sicheritz-Pontén, Thomas |
author_facet | Rangel-Pineros, Guillermo Millard, Andrew Michniewski, Slawomir Scanlan, David Sirén, Kimmo Reyes, Alejandro Petersen, Bent Clokie, Martha R.J. Sicheritz-Pontén, Thomas |
author_sort | Rangel-Pineros, Guillermo |
collection | PubMed |
description | Background: Fast and computationally efficient strategies are required to explore genomic relationships within an increasingly large and diverse phage sequence space. Here, we present PhageClouds, a novel approach using a graph database of phage genomic sequences and their intergenomic distances to explore the phage genomic sequence space. Methods: A total of 640,000 phage genomic sequences were retrieved from a variety of databases and public virome assemblies. Intergenomic distances were calculated with dashing, an alignment-free method suitable for handling massive data sets. These data were used to build a Neo4j® graph database. Results: PhageClouds supported the search of related phages among all complete phage genomes from GenBank for a single query phage in just 10 s. Moreover, PhageClouds expanded the number of closely related phage sequences detected for both finished and draft phage genomes, in comparison with searches exclusively targeting phage entries from GenBank. Conclusions: PhageClouds is a novel resource that will facilitate the analysis of phage genomic sequences and the characterization of assembled phage genomes. |
format | Online Article Text |
id | pubmed-9041511 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Mary Ann Liebert, Inc., publishers |
record_format | MEDLINE/PubMed |
spelling | pubmed-9041511 2022-09-21 From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs Rangel-Pineros, Guillermo Millard, Andrew Michniewski, Slawomir Scanlan, David Sirén, Kimmo Reyes, Alejandro Petersen, Bent Clokie, Martha R.J. Sicheritz-Pontén, Thomas Phage (New Rochelle) Original Articles Background: Fast and computationally efficient strategies are required to explore genomic relationships within an increasingly large and diverse phage sequence space. Here, we present PhageClouds, a novel approach using a graph database of phage genomic sequences and their intergenomic distances to explore the phage genomic sequence space. Methods: A total of 640,000 phage genomic sequences were retrieved from a variety of databases and public virome assemblies. Intergenomic distances were calculated with dashing, an alignment-free method suitable for handling massive data sets. These data were used to build a Neo4j® graph database. Results: PhageClouds supported the search of related phages among all complete phage genomes from GenBank for a single query phage in just 10 s. Moreover, PhageClouds expanded the number of closely related phage sequences detected for both finished and draft phage genomes, in comparison with searches exclusively targeting phage entries from GenBank. Conclusions: PhageClouds is a novel resource that will facilitate the analysis of phage genomic sequences and the characterization of assembled phage genomes. Mary Ann Liebert, Inc., publishers 2021-12-01 2021-12-16 /pmc/articles/PMC9041511/ /pubmed/36147515 http://dx.doi.org/10.1089/phage.2021.0008 Text en © Guillermo Rangel-Pineros et al. 2021; Published by Mary Ann Liebert, Inc.
This Open Access article is distributed under the terms of the Creative Commons Attribution Noncommercial License [CC-BY-NC] (https://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and the source are cited. |
spellingShingle | Original Articles Rangel-Pineros, Guillermo Millard, Andrew Michniewski, Slawomir Scanlan, David Sirén, Kimmo Reyes, Alejandro Petersen, Bent Clokie, Martha R.J. Sicheritz-Pontén, Thomas From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs |
title | From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs |
title_sort | from trees to clouds: phageclouds for fast comparison of ∼640,000 phage genomic sequences and host-centric visualization using genomic network graphs |
topic | Original Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9041511/ https://www.ncbi.nlm.nih.gov/pubmed/36147515 http://dx.doi.org/10.1089/phage.2021.0008 |