
PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals

Bibliographic Details
Main authors: Katebi, Mohsen; Feijao, Pedro; Booth, Julius; Mansouri, Mehrdad; La, Sean; Sweeten, Alex; Miraskarshahi, Reza; Nguyen, Matthew; Wong, Johnathan; Hsiao, William; Chauve, Cedric; Chindelevitch, Leonid
Format: Online Article Text
Language: English
Published: 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197062/
http://dx.doi.org/10.1007/978-3-030-42266-0_9
Description
Summary: In this paper we study the problem of clustering bacterial isolates into epidemiologically related groups from next-generation sequencing data. Existing methods for this problem mainly use a single genotyping signal, and either use a distance-based method with a pre-specified number of clusters, or a phylogenetic tree-based method with a pre-specified threshold. We propose PathOGiST, an algorithmic framework for clustering bacterial isolates by leveraging multiple genotypic signals and calibrated thresholds. PathOGiST uses different genotypic signals, clusters the isolates based on these individual signals with correlation clustering, and combines the clusterings based on the individual signals through consensus clustering. We implemented and tested PathOGiST on three different bacterial pathogens - Escherichia coli, Yersinia pseudotuberculosis, and Mycobacterium tuberculosis - and we conclude by discussing further avenues to explore.
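
The two-stage idea in the summary (cluster each signal with a threshold, then merge the per-signal clusterings by consensus) can be illustrated with a small sketch. This is not the PathOGiST implementation: the summary does not say how the correlation clustering problem is solved, so a simple greedy pivot heuristic is swapped in here. All function names, the toy distance matrices, the thresholds, and the majority-vote consensus weights are assumptions made for illustration only.

```python
def pivot_correlation_clustering(n, weight):
    """Greedy pivot heuristic (an assumed stand-in, not PathOGiST's solver).

    weight(i, j) > 0 means items i and j should share a cluster,
    weight(i, j) <= 0 means they should be separated.
    """
    labels = [-1] * n
    unassigned = list(range(n))
    cluster_id = 0
    while unassigned:
        pivot = unassigned[0]  # deterministic pivot choice, for reproducibility
        members = [pivot] + [j for j in unassigned[1:] if weight(pivot, j) > 0]
        for m in members:
            labels[m] = cluster_id
        unassigned = [j for j in unassigned if labels[j] == -1]
        cluster_id += 1
    return labels


def cluster_by_signal(dist, threshold):
    """Cluster isolates from one signal's distance matrix.

    Pairs closer than the threshold get a positive weight, so the
    number of clusters is not fixed in advance.
    """
    return pivot_correlation_clustering(
        len(dist), lambda i, j: threshold - dist[i][j]
    )


def consensus_clustering(clusterings):
    """Combine per-signal clusterings by majority vote on each pair.

    The weight of a pair is (number of clusterings that co-cluster it)
    minus half the number of clusterings; exact ties are kept apart.
    """
    k = len(clusterings)
    n = len(clusterings[0])

    def weight(i, j):
        together = sum(1 for c in clusterings if c[i] == c[j])
        return together - k / 2

    return pivot_correlation_clustering(n, weight)


# Toy data: pairwise distances between 4 isolates under three
# hypothetical genotyping signals (values invented for illustration).
dist_a = [[0, 2, 30, 31], [2, 0, 29, 30], [30, 29, 0, 3], [31, 30, 3, 0]]
dist_b = [[0, 1, 1, 9], [1, 0, 2, 9], [1, 2, 0, 8], [9, 9, 8, 0]]
dist_c = [[0, 4, 20, 22], [4, 0, 21, 20], [20, 21, 0, 5], [22, 20, 5, 0]]

labels_a = cluster_by_signal(dist_a, threshold=10)
labels_b = cluster_by_signal(dist_b, threshold=5)
labels_c = cluster_by_signal(dist_c, threshold=10)

consensus_labels = consensus_clustering([labels_a, labels_b, labels_c])
print(labels_a, labels_b, labels_c, consensus_labels)
```

In this toy run, signal B groups the third isolate with the first two, while signals A and C do not; the majority vote in the consensus step resolves the disagreement in favor of A and C. A real pipeline would weight signals by reliability and calibrate each threshold from data, which this sketch does not attempt.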