
PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals

In this paper we study the problem of clustering bacterial isolates into epidemiologically related groups from next-generation sequencing data. Existing methods for this problem mainly use a single genotyping signal, and either use a distance-based method with a pre-specified number of clusters, or...


Bibliographic Details
Main authors: Katebi, Mohsen, Feijao, Pedro, Booth, Julius, Mansouri, Mehrdad, La, Sean, Sweeten, Alex, Miraskarshahi, Reza, Nguyen, Matthew, Wong, Johnathan, Hsiao, William, Chauve, Cedric, Chindelevitch, Leonid
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197062/
http://dx.doi.org/10.1007/978-3-030-42266-0_9
_version_ 1783528809620832256
author Katebi, Mohsen
Feijao, Pedro
Booth, Julius
Mansouri, Mehrdad
La, Sean
Sweeten, Alex
Miraskarshahi, Reza
Nguyen, Matthew
Wong, Johnathan
Hsiao, William
Chauve, Cedric
Chindelevitch, Leonid
author_facet Katebi, Mohsen
Feijao, Pedro
Booth, Julius
Mansouri, Mehrdad
La, Sean
Sweeten, Alex
Miraskarshahi, Reza
Nguyen, Matthew
Wong, Johnathan
Hsiao, William
Chauve, Cedric
Chindelevitch, Leonid
author_sort Katebi, Mohsen
collection PubMed
description In this paper we study the problem of clustering bacterial isolates into epidemiologically related groups from next-generation sequencing data. Existing methods for this problem mainly use a single genotyping signal, and either use a distance-based method with a pre-specified number of clusters, or a phylogenetic tree-based method with a pre-specified threshold. We propose PathOGiST, an algorithmic framework for clustering bacterial isolates by leveraging multiple genotypic signals and calibrated thresholds. PathOGiST uses different genotypic signals, clusters the isolates based on these individual signals with correlation clustering, and combines the clusterings based on the individual signals through consensus clustering. We implemented and tested PathOGiST on three different bacterial pathogens - Escherichia coli, Yersinia pseudotuberculosis, and Mycobacterium tuberculosis - and we conclude by discussing further avenues to explore.
format Online
Article
Text
id pubmed-7197062
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7197062 2020-05-04 PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals Katebi, Mohsen Feijao, Pedro Booth, Julius Mansouri, Mehrdad La, Sean Sweeten, Alex Miraskarshahi, Reza Nguyen, Matthew Wong, Johnathan Hsiao, William Chauve, Cedric Chindelevitch, Leonid Algorithms for Computational Biology Article In this paper we study the problem of clustering bacterial isolates into epidemiologically related groups from next-generation sequencing data. Existing methods for this problem mainly use a single genotyping signal, and either use a distance-based method with a pre-specified number of clusters, or a phylogenetic tree-based method with a pre-specified threshold. We propose PathOGiST, an algorithmic framework for clustering bacterial isolates by leveraging multiple genotypic signals and calibrated thresholds. PathOGiST uses different genotypic signals, clusters the isolates based on these individual signals with correlation clustering, and combines the clusterings based on the individual signals through consensus clustering. We implemented and tested PathOGiST on three different bacterial pathogens - Escherichia coli, Yersinia pseudotuberculosis, and Mycobacterium tuberculosis - and we conclude by discussing further avenues to explore. 2020-02-01 /pmc/articles/PMC7197062/ http://dx.doi.org/10.1007/978-3-030-42266-0_9 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Katebi, Mohsen
Feijao, Pedro
Booth, Julius
Mansouri, Mehrdad
La, Sean
Sweeten, Alex
Miraskarshahi, Reza
Nguyen, Matthew
Wong, Johnathan
Hsiao, William
Chauve, Cedric
Chindelevitch, Leonid
PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title_full PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title_fullStr PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title_full_unstemmed PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title_short PathOGiST: A Novel Method for Clustering Pathogen Isolates by Combining Multiple Genotyping Signals
title_sort pathogist: a novel method for clustering pathogen isolates by combining multiple genotyping signals
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197062/
http://dx.doi.org/10.1007/978-3-030-42266-0_9