A43 Translational research: NGS metagenomics into clinical diagnostics

As research next-generation sequencing (NGS) metagenomic pipelines transition to clinical diagnostics, the user base changes from bioinformaticians to biologists, medical doctors, and lab technicians. Besides the obvious need for benchmarking and assessment of diagnostic outcomes of the pipelines an...

Bibliographic Details
Main Authors: Schmitz, D, Nooij, S, Janssens, T, Cremer, J, Vennema, H, Kroneman, A, Koopmans, M
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735915/
http://dx.doi.org/10.1093/ve/vez002.042
author Schmitz, D
Nooij, S
Janssens, T
Cremer, J
Vennema, H
Kroneman, A
Koopmans, M
collection PubMed
description As research next-generation sequencing (NGS) metagenomic pipelines transition to clinical diagnostics, the user base changes from bioinformaticians to biologists, medical doctors, and lab technicians. Besides the obvious need for benchmarking and assessment of the diagnostic outcomes of the pipelines and tools, other focus points remain: reproducibility, data immutability, user-friendliness, portability/scalability, privacy, and a clear audit trail. We have a research metagenomics pipeline that takes raw FASTQ files and produces annotated contigs, but it is too complicated for non-bioinformaticians. Here, we present preliminary findings in adapting this pipeline for clinical diagnostics. We used information available on relevant fora (www.bioinfo-core.org) and the experiences and publications of fellow bioinformaticians at other institutes (COMPARE, UBC, and LUMC). From this information, a robust and user-friendly storage and analysis workflow was designed for non-bioinformaticians in a clinical setting. Via Conda [https://conda.io] and Docker containers [http://www.docker.com], we made our disparate pipeline processes self-contained and reproducible. Furthermore, we moved all pipeline settings into a separate JSON file. After every analysis, the pipeline settings and virtual-environment recipes will be archived (immutably) under a persistent unique identifier. This allows precise long-term reproducibility. Likewise, after every run the raw data and final products will be archived automatically, complying with data retention laws and guidelines. All the disparate processes in the pipeline are parallelized and automated via Snakemake [1], so end users need no coding skills. In addition, interactive web reports such as MultiQC [http://multiqc.info] and Krona [2] are generated automatically. By combining Snakemake, Conda, and containers, our pipeline is highly portable and can easily be scaled up for outbreak situations or scaled down to reduce costs. 
Since patient privacy is a concern, our pipeline automatically removes human genetic data. Moreover, all source code will be stored on an internal GitLab server, which, combined with the archived data, ensures a clear audit trail. Nevertheless, challenges remain: (1) reproducible reference databases, e.g. being able to revert to an older version to reproduce old analyses; (2) a user-friendly GUI; (3) connecting the pipeline and NGS data to the in-house LIMS; and (4) efficient long-term storage, e.g. lossless compression algorithms. Even so, this work represents a step forward in making user-friendly clinical diagnostic workflows.
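The abstract states that pipeline settings are kept in a separate JSON file and archived immutably under a persistent unique identifier, but it does not say how that identifier is produced. The following is only a minimal sketch of one possible approach, assuming the identifier is a SHA-256 digest of the canonicalized settings. The function names, the example settings keys, and the archive layout are all hypothetical and are not taken from the authors' pipeline.

```python
import hashlib
import json
import stat
from pathlib import Path


def settings_identifier(settings: dict) -> str:
    """Return a SHA-256 hex digest of the settings in canonical JSON form.

    Assumption: a content hash serves as the persistent unique identifier.
    """
    # Sorting keys makes the digest independent of key order in the file.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def archive_settings(settings: dict, archive_dir: Path) -> Path:
    """Write the settings once under their identifier and mark the file read-only."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{settings_identifier(settings)}.json"
    if not target.exists():  # identical settings map to the same archived file
        target.write_text(
            json.dumps(settings, sort_keys=True, indent=2), encoding="utf-8"
        )
        # Read-only permissions are a weak stand-in for true immutability.
        target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target
```

With this design, two runs that use the same settings resolve to the same archived file, so the identifier recorded alongside a result is enough to look up the exact configuration later. A production archive would likely need stronger guarantees than file permissions, such as write-once storage.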
format Online
Article
Text
id pubmed-6735915
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6735915 2019-09-16 A43 Translational research: NGS metagenomics into clinical diagnostics Schmitz, D Nooij, S Janssens, T Cremer, J Vennema, H Kroneman, A Koopmans, M Virus Evol Abstract Overview Oxford University Press 2019-08-22 /pmc/articles/PMC6735915/ http://dx.doi.org/10.1093/ve/vez002.042 Text en © Published by Oxford University Press. This is an Open Access publication distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A43 Translational research: NGS metagenomics into clinical diagnostics
topic Abstract Overview
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735915/
http://dx.doi.org/10.1093/ve/vez002.042