Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains

BACKGROUND: The distributed genome hypothesis (DGH) posits that chronic bacterial pathogens utilize polyclonal infection and reassortment of genic characters to ensure persistence in the face of adaptive host defenses. Studies based on random sequencing of multiple strain libraries suggested that fr...

Bibliographic Details
Main authors: Hogg, Justin S; Hu, Fen Z; Janto, Benjamin; Boissy, Robert; Hayes, Jay; Keefe, Randy; Post, J Christopher; Ehrlich, Garth D
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2394751/
https://www.ncbi.nlm.nih.gov/pubmed/17550610
http://dx.doi.org/10.1186/gb-2007-8-6-r103
author Hogg, Justin S
Hu, Fen Z
Janto, Benjamin
Boissy, Robert
Hayes, Jay
Keefe, Randy
Post, J Christopher
Ehrlich, Garth D
author_sort Hogg, Justin S
collection PubMed
description BACKGROUND: The distributed genome hypothesis (DGH) posits that chronic bacterial pathogens utilize polyclonal infection and reassortment of genic characters to ensure persistence in the face of adaptive host defenses. Studies based on random sequencing of multiple strain libraries suggested that free-living bacterial species possess a supragenome that is much larger than the genome of any single bacterium. RESULTS: We derived high depth genomic coverage of nine nontypeable Haemophilus influenzae (NTHi) clinical isolates, bringing to 13 the number of sequenced NTHi genomes. Clustering identified 2,786 genes, of which 1,461 were common to all strains, with each of the remaining 1,328 found in a subset of strains; the number of clusters ranged from 1,686 to 1,878 per strain. Genic differences of between 96 and 585 were identified per strain pair. Comparisons of each of the NTHi strains with the Rd strain revealed between 107 and 158 insertions and 100 and 213 deletions per genome. The mean insertion and deletion sizes were 1,356 and 1,020 base-pairs, respectively, with mean maximum insertions and deletions of 26,977 and 37,299 base-pairs. This relatively large number of small rearrangements among strains is in keeping with what is known about the transformation mechanisms in this naturally competent pathogen. CONCLUSION: A finite supragenome model was developed to explain the distribution of genes among strains. The model predicts that the NTHi supragenome contains between 4,425 and 6,052 genes with most uncertainty regarding the number of rare genes, those that have a frequency of <0.1 among strains; collectively, these results support the DGH.
format Text
id pubmed-2394751
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2394751 2008-05-29 Genome Biol Research BioMed Central 2007 2007-06-05 /pmc/articles/PMC2394751/ /pubmed/17550610 http://dx.doi.org/10.1186/gb-2007-8-6-r103 Text en Copyright © 2007 Hogg et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2394751/
https://www.ncbi.nlm.nih.gov/pubmed/17550610
http://dx.doi.org/10.1186/gb-2007-8-6-r103