
The Average Mutual Information Profile as a Genomic Signature

BACKGROUND: Occult organizational structures in DNA sequences may hold the key to understanding functional and evolutionary aspects of the DNA molecule. Such structures can also provide the means for identifying and discriminating organisms using genomic data. Species-specific genomic signatures are...


Bibliographic Details
Main authors: Bauer, Mark; Schuster, Sheldon M; Sayood, Khalid
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2335307/
https://www.ncbi.nlm.nih.gov/pubmed/18218139
http://dx.doi.org/10.1186/1471-2105-9-48
_version_ 1782152821538291712
author Bauer, Mark
Schuster, Sheldon M
Sayood, Khalid
collection PubMed
description BACKGROUND: Occult organizational structures in DNA sequences may hold the key to understanding functional and evolutionary aspects of the DNA molecule. Such structures can also provide the means for identifying and discriminating organisms using genomic data. Species-specific genomic signatures are useful in a variety of contexts, such as evolutionary analysis, assembly and classification of genomic sequences from large uncultivated microbial communities, and rapid identification in health hazard situations. RESULTS: We have analyzed genomic sequences of eukaryotic and prokaryotic chromosomes, as well as various subtypes of viruses, using an information-theoretic framework. We confirm the existence of a species-specific average mutual information (AMI) profile. We use these profiles to define a very simple, computationally efficient, alignment-free distance measure that reflects the evolutionary relationships between genomic sequences. We use this distance measure to classify chromosomes according to species of origin, to separate and cluster subtypes of the HIV-1 virus, and to classify DNA fragments to species of origin. CONCLUSION: AMI profiles of DNA sequences prove to be species specific and easy to compute. The structure of AMI profiles is conserved, even in short subsequences of a species' genome, rendering a pervasive signature. This signature can be used to classify relatively short DNA fragments to species of origin.
format Text
id pubmed-2335307
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2335307 2008-04-28 The Average Mutual Information Profile as a Genomic Signature Bauer, Mark Schuster, Sheldon M Sayood, Khalid BMC Bioinformatics Research Article BioMed Central 2008-01-25 /pmc/articles/PMC2335307/ /pubmed/18218139 http://dx.doi.org/10.1186/1471-2105-9-48 Text en Copyright © 2008 Bauer et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The Average Mutual Information Profile as a Genomic Signature
topic Research Article
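
Illustrative sketch (not part of the catalog record). The abstract above describes an average mutual information (AMI) profile and a distance measure built on it, but gives no formulas. The Python below is a minimal sketch under stated assumptions: it uses the textbook definition of mutual information between symbols k positions apart, with probabilities estimated from the observed pairs, and it uses a plain Euclidean distance between profiles as a stand-in. The function names ami_profile and profile_distance are hypothetical, and neither function is claimed to reproduce the authors' exact estimator or distance.

```python
import math
from collections import Counter


def ami_profile(seq, max_lag):
    """Return [I(1), ..., I(max_lag)] in bits for a symbol sequence.

    I(k) is the mutual information between the symbol at position i and
    the symbol at position i + k, estimated from the observed pairs.
    Assumption: textbook estimator, not necessarily the paper's.
    """
    n = len(seq)
    if max_lag < 1 or max_lag >= n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(seq)")
    profile = []
    for k in range(1, max_lag + 1):
        total = n - k                        # number of pairs at this lag
        pairs = Counter(zip(seq, seq[k:]))   # joint counts of (x, y)
        left = Counter(seq[:total])          # counts of the first symbol
        right = Counter(seq[k:])             # counts of the second symbol
        info = 0.0
        for (x, y), count in pairs.items():
            p_xy = count / total
            # p_xy / (p_x * p_y), written with raw counts
            ratio = p_xy * total * total / (left[x] * right[y])
            info += p_xy * math.log2(ratio)
        profile.append(info)
    return profile


def profile_distance(p, q):
    """Euclidean distance between two equal-length profiles.

    Assumption: a stand-in metric chosen for illustration only.
    """
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    a = ami_profile("ACGTACGTTTGACCA" * 10, 8)
    b = ami_profile("AATTAATTCCGGCCG" * 10, 8)
    print(profile_distance(a, b))
```

Because each lag needs only one pass over the sequence, the cost grows linearly with sequence length times the number of lags, which is consistent with the abstract's description of the measure as computationally efficient and alignment free.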