Based Upon Repeat Pattern (BURP): an algorithm to characterize the long-term evolution of Staphylococcus aureus populations based on spa polymorphisms

Bibliographic Details
Main authors: Mellmann, Alexander; Weniger, Thomas; Berssenbrügge, Christoph; Rothgänger, Jörg; Sammeth, Michael; Stoye, Jens; Harmsen, Dag
Format: Text
Language: English
Published: BioMed Central 2007
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2148047/
https://www.ncbi.nlm.nih.gov/pubmed/17967176
http://dx.doi.org/10.1186/1471-2180-7-98
author Mellmann, Alexander
Weniger, Thomas
Berssenbrügge, Christoph
Rothgänger, Jörg
Sammeth, Michael
Stoye, Jens
Harmsen, Dag
collection PubMed
description BACKGROUND: For typing of Staphylococcus aureus, DNA sequencing of the repeat region of the protein A (spa) gene is a well-established discriminatory method for outbreak investigations. Recently, it was hypothesized that this region also reflects long-term epidemiology. However, no automated and objective algorithm existed to cluster different repeat regions. In this study, the Based Upon Repeat Pattern (BURP) implementation, a heuristic variant of the newly described EDSI algorithm, was investigated to infer the clonal relatedness of different spa types. For calibration of BURP parameters, 400 representative S. aureus strains with different spa types were characterized by MLST and clustered using eBURST as the "gold standard" for their phylogeny. Typing concordance analyses between eBURST and BURP clustering (spa-CC) were performed using all possible BURP parameters to determine their optimal combination. BURP was subsequently evaluated with a strain collection reflecting the breadth of diversity of S. aureus (JCM 2002; 40:4544).
RESULTS: In total, the 400 strains exhibited 122 different MLST types. eBURST grouped them into 23 clonal complexes (CC; 354 isolates) and 33 singletons (46 isolates). BURP clustering of spa types using all possible parameter combinations and subsequent comparison with eBURST CCs resulted in concordances ranging from 8.2 to 96.2%. However, 96.2% concordance was reached only if spa types shorter than 8 repeats were excluded, which excluded 37% of spa types. Therefore, the optimal combination of the BURP parameters was "exclude spa types shorter than 5 repeats" and "cluster spa types into spa-CC if cost distances are less than 4", exhibiting 95.3% concordance with eBURST. This algorithm identified 24 spa-CCs and 40 singletons, and excluded only 7.8% of spa types. Analyzing the natural population with these parameters, the comparison of whole-genome micro-array groupings (at the level of 0.31 Pearson correlation index) and spa-CCs gave a concordance of 87.1%; BURP spa-CCs vs. manually grouped spa types resulted in 95.7% concordance.
CONCLUSION: BURP is the first automated and objective tool to infer clonal relatedness from spa repeat regions. It is able to extract an evolutionary signal rather congruent with MLST and micro-array data.
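The two parameters named in the abstract (a minimum repeat count and a cost threshold) can be illustrated with a minimal sketch. This is not the BURP implementation: the cost function below is a plain Levenshtein distance over repeat IDs, used only as a stand-in for the EDSI cost model, and the single-linkage grouping, the function names, and the repeat successions in `example` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def edit_cost(a, b):
    """Levenshtein distance over repeat IDs (assumed stand-in, NOT the EDSI cost)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete a repeat
                           cur[j - 1] + 1,           # insert a repeat
                           prev[j - 1] + (x != y)))  # substitute a repeat
        prev = cur
    return prev[-1]


def cluster_spa_types(spa_types, min_repeats=5, max_cost=4):
    """Return (clusters, singletons, excluded) for a dict name -> repeat list.

    Defaults follow the abstract's wording: exclude types shorter than
    5 repeats; link two types if their cost is less than 4.
    """
    excluded = sorted(n for n, r in spa_types.items() if len(r) < min_repeats)
    kept = [n for n in spa_types if n not in excluded]

    # Union-find over the kept types (single linkage, an assumption).
    parent = {n: n for n in kept}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(kept, 2):
        if edit_cost(spa_types[a], spa_types[b]) < max_cost:
            parent[find(a)] = find(b)

    groups = {}
    for n in kept:
        groups.setdefault(find(n), []).append(n)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons, excluded


# Hypothetical repeat successions, invented for this sketch only.
example = {
    "A": [15, 12, 16, 2, 16, 2, 25, 17, 24],
    "B": [15, 12, 16, 2, 25, 17, 24],
    "C": [8, 16, 2, 25, 17, 24],
    "D": [7, 23, 21, 17, 34, 12, 23, 2, 12, 23],
    "E": [7, 23],
}
clusters, singletons, excluded = cluster_spa_types(example)
```

Under these assumptions, types A, B, and C fall into one group, D remains a singleton, and E is excluded for having fewer than 5 repeats. A real cost model would also need to account for repeat duplication and excision events, which a plain edit distance does not capture.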
format Text
id pubmed-2148047
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2148047 2007-12-20 BMC Microbiol Research Article BioMed Central 2007-10-29 /pmc/articles/PMC2148047/ /pubmed/17967176 http://dx.doi.org/10.1186/1471-2180-7-98 Text en Copyright © 2007 Mellmann et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Based Upon Repeat Pattern (BURP): an algorithm to characterize the long-term evolution of Staphylococcus aureus populations based on spa polymorphisms
topic Research Article