
Clustering of Giant Virus-DNA Based on Variations in Local Entropy

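We present a method for clustering genomic sequences based on variations in local entropy. We have analyzed the distributions of the block entropies of viruses and plant genomes. A distinct pattern for viruses and plant genomes is observed. These distributions, which describe the local entropic variability of the genomes, are used for clustering the genomes based on the Jensen-Shannon (JS) distances. The analysis of the JS distances between all genomes that infect the chlorella algae shows the host specificity of the viruses. We illustrate the efficacy of this entropy-based clustering technique by the segregation of plant and virus genomes into separate bins.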
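The record's full description outlines a pipeline: compute block entropies along a genome, summarize them as a distribution, and compare genomes by Jensen-Shannon (JS) distance. The following is a minimal Python sketch of that idea, not the authors' implementation. The non-overlapping blocks, the per-nucleotide Shannon entropy within each block, the histogram binning, and the names `shannon_entropy`, `block_entropy_distribution`, and `js_distance` are all assumptions made for illustration; the paper's exact definition of block entropy may differ.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Shannon entropy (bits) of the symbol frequencies in one block."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def block_entropy_distribution(seq, block_size=8, bins=8, max_entropy=2.0):
    """Histogram of per-block entropies, normalized to sum to 1.

    Assumption: non-overlapping blocks; max_entropy=2.0 bits corresponds
    to a four-letter DNA alphabet. A trailing partial block is ignored.
    """
    n_blocks = len(seq) // block_size
    if n_blocks == 0:
        raise ValueError("sequence is shorter than one block")
    hist = [0] * bins
    for i in range(n_blocks):
        h = shannon_entropy(seq[i * block_size:(i + 1) * block_size])
        # Clamp so that h == max_entropy falls into the last bin.
        idx = min(int(h / max_entropy * bins), bins - 1)
        hist[idx] += 1
    return [count / n_blocks for count in hist]


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))


if __name__ == "__main__":
    # Toy sequences, not real genomes.
    low = "AAAAAAAA" * 10
    high = "ACGTACGT" * 10
    d = js_distance(block_entropy_distribution(low),
                    block_entropy_distribution(high))
    print(d)
```

The resulting pairwise distances could then be passed to any standard clustering routine; which one the authors used is not stated in this record.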


Bibliographic Details
Main authors: Bose, Ranjan; Thiel, Gerhard; Hamacher, Kay
Format: Online Article Text
Language: English
Published: MDPI 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074927/
https://www.ncbi.nlm.nih.gov/pubmed/24887142
http://dx.doi.org/10.3390/v6062259
collection PubMed
description We present a method for clustering genomic sequences based on variations in local entropy. We have analyzed the distributions of the block entropies of viruses and plant genomes. A distinct pattern for viruses and plant genomes is observed. These distributions, which describe the local entropic variability of the genomes, are used for clustering the genomes based on the Jensen-Shannon (JS) distances. The analysis of the JS distances between all genomes that infect the chlorella algae shows the host specificity of the viruses. We illustrate the efficacy of this entropy-based clustering technique by the segregation of plant and virus genomes into separate bins.
id pubmed-4074927
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4074927 2014-06-30 Clustering of Giant Virus-DNA Based on Variations in Local Entropy. Bose, Ranjan; Thiel, Gerhard; Hamacher, Kay. Viruses. Article. MDPI 2014-05-30. /pmc/articles/PMC4074927/ /pubmed/24887142 http://dx.doi.org/10.3390/v6062259 Text en. © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).