Optimizing a Whole-Genome Sequencing Data Processing Pipeline for Precision Surveillance of Health Care-Associated Infections

The surveillance of health care-associated infection (HAI) is an essential element of the infection control program. While whole-genome sequencing (WGS) has been widely adopted for genomic surveillance, its data processing remains to be improved. Here, we propose a three-level data processing pipeli...

Bibliographic Details
Main authors: Huang, Weihua; Wang, Guiqing; Yin, Changhong; Chen, Donald; Dhand, Abhay; Chanza, Melissa; Dimitrova, Nevenka; Fallon, John T.
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6843764/
https://www.ncbi.nlm.nih.gov/pubmed/31554234
http://dx.doi.org/10.3390/microorganisms7100388
author Huang, Weihua
Wang, Guiqing
Yin, Changhong
Chen, Donald
Dhand, Abhay
Chanza, Melissa
Dimitrova, Nevenka
Fallon, John T.
collection PubMed
description The surveillance of health care-associated infection (HAI) is an essential element of the infection control program. While whole-genome sequencing (WGS) has been widely adopted for genomic surveillance, its data processing remains to be improved. Here, we propose a three-level data processing pipeline for the precision genomic surveillance of microorganisms without prior knowledge: species identification, multi-locus sequence typing (MLST), and sub-MLST clustering. The first two are closely related to methods already widely used in clinical microbiology laboratories, whereas the third provides significantly improved resolution and accuracy in genomic surveillance. In comparison with a broadly used reference-dependent alignment/mapping method and an annotation-dependent pan-/core-genome analysis, we applied our reference- and annotation-independent, k-mer-based, simplified workflow to a test collection of Acinetobacter and Enterococcus clinical isolates. By taking both single nucleotide variants and genomic structural changes into account, the optimized k-mer-based pipeline rapidly provided a global view of bacterial population structure and discriminated the relatedness between bacterial isolates with greater detail and precision. The newly developed WGS data processing pipeline would facilitate the application of WGS to the precision genomic surveillance of HAI. In addition, the results from such a WGS-based analysis would be useful for the precision laboratory diagnosis of infectious microorganisms.
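The record does not describe how the authors' k-mer-based workflow is implemented. As a rough illustration of the general idea of reference- and annotation-independent comparison only, the following minimal sketch computes Jaccard distances between sets of canonical k-mers. The function names, the choice of Jaccard distance, and the default `k` are all assumptions made for this example, not details taken from the article.

```python
from itertools import combinations

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(sequence, k):
    """Return the set of canonical k-mers in a DNA sequence.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands give the same profile.
    """
    sequence = sequence.upper()
    kmers = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
        kmers.add(min(kmer, reverse_complement))
    return kmers


def jaccard_distance(kmers_a, kmers_b):
    """Return 1 - |A & B| / |A | B|; two empty profiles count as identical."""
    union = kmers_a | kmers_b
    if not union:
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


def pairwise_distances(genomes, k=5):
    """Map each pair of isolate names (in sorted order) to a k-mer distance.

    `genomes` is a hypothetical dict of isolate name -> sequence string.
    """
    profiles = {name: canonical_kmers(seq, k) for name, seq in genomes.items()}
    return {
        (a, b): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }
```

Because the comparison uses only the k-mer content of each sequence, it needs neither a reference genome nor gene annotation. A realistic analysis would use much longer k-mers on assembled genomes or reads and would feed the resulting distance matrix into a clustering step.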
format Online
Article
Text
id pubmed-6843764
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6843764 2019-11-25 Microorganisms Article MDPI 2019-09-24 Text en © 2019 by the authors.
Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Optimizing a Whole-Genome Sequencing Data Processing Pipeline for Precision Surveillance of Health Care-Associated Infections
topic Article