
Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

BACKGROUND: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compare...


Bibliographic Details
Main authors: Bohlin, Jon; Skjerve, Eystein; Ussery, David W
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770534/
https://www.ncbi.nlm.nih.gov/pubmed/19845945
http://dx.doi.org/10.1186/1471-2164-10-487
_version_ 1782173677614268416
author Bohlin, Jon
Skjerve, Eystein
Ussery, David W
author_facet Bohlin, Jon
Skjerve, Eystein
Ussery, David W
author_sort Bohlin, Jon
collection PubMed
description BACKGROUND: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. RESULTS: Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. CONCLUSION: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.
format Text
id pubmed-2770534
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-27705342009-10-30 Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering Bohlin, Jon Skjerve, Eystein Ussery, David W BMC Genomics Research Article BACKGROUND: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. RESULTS: Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. CONCLUSION: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level. BioMed Central 2009-10-21 /pmc/articles/PMC2770534/ /pubmed/19845945 http://dx.doi.org/10.1186/1471-2164-10-487 Text en Copyright © 2009 Bohlin et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Bohlin, Jon
Skjerve, Eystein
Ussery, David W
Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering
title Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering
title_sort analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770534/
https://www.ncbi.nlm.nih.gov/pubmed/19845945
http://dx.doi.org/10.1186/1471-2164-10-487
work_keys_str_mv AT bohlinjon analysisofgenomicsignaturesinprokaryotesusingmultinomialregressionandhierarchicalclustering
AT skjerveeystein analysisofgenomicsignaturesinprokaryotesusingmultinomialregressionandhierarchicalclustering
AT usserydavidw analysisofgenomicsignaturesinprokaryotesusingmultinomialregressionandhierarchicalclustering
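
The abstract in this record describes comparing genomic signatures pair-wise and then clustering the results hierarchically. The following is a minimal sketch of that idea, not the authors' implementation: the record does not state the oligonucleotide length, the distance measure, or the linkage rule, so tetranucleotide frequencies, Manhattan distance, and average linkage are all assumptions, and the four toy sequences are made up for illustration. The multinomial regression step is not sketched here.

```python
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of every DNA k-mer, counted over overlapping windows."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[w] / total for w in kmers]


def distance(u, v):
    """Manhattan distance between two frequency vectors (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def average_linkage(vectors, n_clusters):
    """Naive agglomerative clustering; returns clusters as lists of indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(i, j) for i in a for j in b]
        return sum(distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy sequences: two AT-rich and two GC-rich.
sequences = [
    "ATATTAAATTTATAATATTA" * 5,
    "TTAATATAAATATTTAATAT" * 5,
    "GCGGCCGCGCCGGGCGCGCC" * 5,
    "CCGCGGGCCGCGCGGCCGCG" * 5,
]
signatures = [kmer_frequencies(s) for s in sequences]
clusters = average_linkage(signatures, n_clusters=2)
print(clusters)
```

On these toy inputs the AT-rich and GC-rich sequences share no tetranucleotides, so they end up in separate clusters, which loosely mirrors the abstract's finding that AT content is the factor most strongly associated with the genomic signature.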