
A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter

The increasing availability of hundreds of whole bacterial genomes provides opportunities for enhanced understanding of the genes and alleles responsible for clinically important phenotypes and how they evolved. However, it is a significant challenge to develop easy-to-use and scalable methods for c...


Bibliographic Details
Main authors: Méric, Guillaume, Yahara, Koji, Mageiros, Leonardos, Pascoe, Ben, Maiden, Martin C. J., Jolley, Keith A., Sheppard, Samuel K.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3968026/
https://www.ncbi.nlm.nih.gov/pubmed/24676150
http://dx.doi.org/10.1371/journal.pone.0092798
_version_ 1782309099696816128
author Méric, Guillaume
Yahara, Koji
Mageiros, Leonardos
Pascoe, Ben
Maiden, Martin C. J.
Jolley, Keith A.
Sheppard, Samuel K.
author_facet Méric, Guillaume
Yahara, Koji
Mageiros, Leonardos
Pascoe, Ben
Maiden, Martin C. J.
Jolley, Keith A.
Sheppard, Samuel K.
author_sort Méric, Guillaume
collection PubMed
description The increasing availability of hundreds of whole bacterial genomes provides opportunities for enhanced understanding of the genes and alleles responsible for clinically important phenotypes and how they evolved. However, it is a significant challenge to develop easy-to-use and scalable methods for characterizing these large and complex data and relating it to disease epidemiology. Existing approaches typically focus on either homologous sequence variation in genes that are shared by all isolates, or non-homologous sequence variation - focusing on genes that are differentially present in the population. Here we present a comparative genomics approach that simultaneously approximates core and accessory genome variation in pathogen populations and apply it to pathogenic species in the genus Campylobacter. A total of 7 published Campylobacter jejuni and Campylobacter coli genomes were selected to represent diversity across these species, and a list of all loci that were present at least once was compiled. After filtering duplicates a 7-isolate reference pan-genome, of 3,933 loci, was defined. A core genome of 1,035 genes was ubiquitous in the sample accounting for 59% of the genes in each isolate (average genome size of 1.68 Mb). The accessory genome contained 2,792 genes. A Campylobacter population sample of 192 genomes was screened for the presence of reference pan-genome loci with gene presence defined as a BLAST match of ≥70% identity over ≥50% of the locus length - aligned using MUSCLE on a gene-by-gene basis. A total of 21 genes were present only in C. coli and 27 only in C. jejuni, providing information about functional differences associated with species and novel epidemiological markers for population genomic analyses. Homologs of these genes were found in several of the genomes used to define the pan-genome and, therefore, would not have been identified using a single reference strain approach.
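The presence rule and the core/accessory split described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the input layout (one tuple per best BLAST hit, carrying isolate, locus, percent identity, aligned length and locus length) and all function names are assumptions made for illustration. Only the two thresholds (≥70% identity over ≥50% of the locus length) come from the text above.

```python
from typing import Dict, Iterable, Set, Tuple

# Thresholds stated in the abstract.
MIN_IDENTITY = 70.0  # percent identity
MIN_COVERAGE = 0.5   # fraction of the reference locus length

# Hypothetical hit layout: (isolate, locus, identity_pct, aligned_len, locus_len)
Hit = Tuple[str, str, float, int, int]


def is_present(identity_pct: float, aligned_len: int, locus_len: int) -> bool:
    """Apply the presence rule: >=70% identity over >=50% of the locus."""
    return identity_pct >= MIN_IDENTITY and aligned_len >= MIN_COVERAGE * locus_len


def presence_by_isolate(hits: Iterable[Hit]) -> Dict[str, Set[str]]:
    """Map each isolate to the set of reference loci called present in it."""
    present: Dict[str, Set[str]] = {}
    for isolate, locus, identity_pct, aligned_len, locus_len in hits:
        loci = present.setdefault(isolate, set())
        if is_present(identity_pct, aligned_len, locus_len):
            loci.add(locus)
    return present


def core_and_accessory(present: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Core = loci present in every isolate; accessory = all other detected loci."""
    sets = list(present.values())
    if not sets:
        return set(), set()
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return core, pan - core


def species_specific(
    present: Dict[str, Set[str]], species: Dict[str, str]
) -> Dict[str, Set[str]]:
    """Loci seen in at least one isolate of a species and in no other species."""
    by_species: Dict[str, Set[str]] = {}
    for isolate, loci in present.items():
        by_species.setdefault(species[isolate], set()).update(loci)
    result: Dict[str, Set[str]] = {}
    for name, loci in by_species.items():
        others = set().union(
            *(other for sp, other in by_species.items() if sp != name)
        )
        result[name] = loci - others
    return result
```

In this sketch a locus counts as species-specific when it is detected in at least one isolate of that species and in none of the others; the abstract does not say whether the reported 21 and 27 genes were required to be present in every isolate of the species, so that choice is an assumption.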
format Online
Article
Text
id pubmed-3968026
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-39680262014-04-01 A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter Méric, Guillaume Yahara, Koji Mageiros, Leonardos Pascoe, Ben Maiden, Martin C. J. Jolley, Keith A. Sheppard, Samuel K. PLoS One Research Article The increasing availability of hundreds of whole bacterial genomes provides opportunities for enhanced understanding of the genes and alleles responsible for clinically important phenotypes and how they evolved. However, it is a significant challenge to develop easy-to-use and scalable methods for characterizing these large and complex data and relating it to disease epidemiology. Existing approaches typically focus on either homologous sequence variation in genes that are shared by all isolates, or non-homologous sequence variation - focusing on genes that are differentially present in the population. Here we present a comparative genomics approach that simultaneously approximates core and accessory genome variation in pathogen populations and apply it to pathogenic species in the genus Campylobacter. A total of 7 published Campylobacter jejuni and Campylobacter coli genomes were selected to represent diversity across these species, and a list of all loci that were present at least once was compiled. After filtering duplicates a 7-isolate reference pan-genome, of 3,933 loci, was defined. A core genome of 1,035 genes was ubiquitous in the sample accounting for 59% of the genes in each isolate (average genome size of 1.68 Mb). The accessory genome contained 2,792 genes. A Campylobacter population sample of 192 genomes was screened for the presence of reference pan-genome loci with gene presence defined as a BLAST match of ≥70% identity over ≥50% of the locus length - aligned using MUSCLE on a gene-by-gene basis. A total of 21 genes were present only in C. coli and 27 only in C. jejuni, providing information about functional differences associated with species and novel epidemiological markers for population genomic analyses. Homologs of these genes were found in several of the genomes used to define the pan-genome and, therefore, would not have been identified using a single reference strain approach. Public Library of Science 2014-03-27 /pmc/articles/PMC3968026/ /pubmed/24676150 http://dx.doi.org/10.1371/journal.pone.0092798 Text en © 2014 Méric et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Méric, Guillaume
Yahara, Koji
Mageiros, Leonardos
Pascoe, Ben
Maiden, Martin C. J.
Jolley, Keith A.
Sheppard, Samuel K.
A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title_full A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title_fullStr A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title_full_unstemmed A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title_short A Reference Pan-Genome Approach to Comparative Bacterial Genomics: Identification of Novel Epidemiological Markers in Pathogenic Campylobacter
title_sort reference pan-genome approach to comparative bacterial genomics: identification of novel epidemiological markers in pathogenic campylobacter
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3968026/
https://www.ncbi.nlm.nih.gov/pubmed/24676150
http://dx.doi.org/10.1371/journal.pone.0092798
work_keys_str_mv AT mericguillaume areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT yaharakoji areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT mageirosleonardos areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT pascoeben areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT maidenmartincj areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT jolleykeitha areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT sheppardsamuelk areferencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT mericguillaume referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT yaharakoji referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT mageirosleonardos referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT pascoeben referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT maidenmartincj referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT jolleykeitha referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter
AT sheppardsamuelk referencepangenomeapproachtocomparativebacterialgenomicsidentificationofnovelepidemiologicalmarkersinpathogeniccampylobacter