Canary: an atomic pipeline for clinical amplicon assays
BACKGROUND: High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multi...
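Main authors: | Doig, Kenneth D.; Ellul, Jason; Fellowes, Andrew; Thompson, Ella R.; Ryland, Georgina; Blombery, Piers; Papenfuss, Anthony T.; Fox, Stephen B. |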
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5732437/ https://www.ncbi.nlm.nih.gov/pubmed/29246107 http://dx.doi.org/10.1186/s12859-017-1950-z |
_version_ | 1783286697294823424 |
---|---|
author | Doig, Kenneth D. Ellul, Jason Fellowes, Andrew Thompson, Ella R. Ryland, Georgina Blombery, Piers Papenfuss, Anthony T. Fox, Stephen B. |
author_sort | Doig, Kenneth D. |
collection | PubMed |
description | BACKGROUND: High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multiple languages, each with a variety of resource requirements. Pipeline components must be linked together with a workflow system to achieve the processing of FASTQ files through to a VCF file of variants. Crafting these pipelines requires considerable bioinformatics and IT skills beyond the reach of many clinical laboratories. RESULTS: Here we present Canary, a single program that can be run on a laptop, which takes FASTQ files from amplicon assays through to an annotated VCF file ready for clinical analysis. Canary can be installed and run with a single command using Docker containerization or run as a single JAR file on a wide range of platforms. Although it is a single utility, Canary performs all the functions present in more complex and unwieldy pipelines. All variants identified by Canary are 3′ shifted and represented in their most parsimonious form to provide a consistent nomenclature, irrespective of sequencing variation. Further, proximate in-phase variants are represented as a single HGVS ‘delins’ variant. This allows for correct nomenclature and consequences to be ascribed to complex multi-nucleotide polymorphisms (MNPs), which are otherwise difficult to represent and interpret. Variants can also be annotated with hundreds of attributes sourced from MyVariant.info to give up to date details on pathogenicity, population statistics and in-silico predictors. CONCLUSIONS: Canary has been used at the Peter MacCallum Cancer Centre in Melbourne for the last 2 years for the processing of clinical sequencing data. By encapsulating clinical features in a single, easily installed executable, Canary makes sequencing more accessible to all pathology laboratories. Canary is available for download as source or a Docker image at https://github.com/PapenfussLab/Canary under a GPL-3.0 License. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1950-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5732437 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5732437 2017-12-21 Canary: an atomic pipeline for clinical amplicon assays Doig, Kenneth D. Ellul, Jason Fellowes, Andrew Thompson, Ella R. Ryland, Georgina Blombery, Piers Papenfuss, Anthony T. Fox, Stephen B. BMC Bioinformatics Software BACKGROUND: High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multiple languages, each with a variety of resource requirements. Pipeline components must be linked together with a workflow system to achieve the processing of FASTQ files through to a VCF file of variants. Crafting these pipelines requires considerable bioinformatics and IT skills beyond the reach of many clinical laboratories. RESULTS: Here we present Canary, a single program that can be run on a laptop, which takes FASTQ files from amplicon assays through to an annotated VCF file ready for clinical analysis. Canary can be installed and run with a single command using Docker containerization or run as a single JAR file on a wide range of platforms. Although it is a single utility, Canary performs all the functions present in more complex and unwieldy pipelines. All variants identified by Canary are 3′ shifted and represented in their most parsimonious form to provide a consistent nomenclature, irrespective of sequencing variation. Further, proximate in-phase variants are represented as a single HGVS ‘delins’ variant. This allows for correct nomenclature and consequences to be ascribed to complex multi-nucleotide polymorphisms (MNPs), which are otherwise difficult to represent and interpret. Variants can also be annotated with hundreds of attributes sourced from MyVariant.info to give up to date details on pathogenicity, population statistics and in-silico predictors. CONCLUSIONS: Canary has been used at the Peter MacCallum Cancer Centre in Melbourne for the last 2 years for the processing of clinical sequencing data. By encapsulating clinical features in a single, easily installed executable, Canary makes sequencing more accessible to all pathology laboratories. Canary is available for download as source or a Docker image at https://github.com/PapenfussLab/Canary under a GPL-3.0 License. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1950-z) contains supplementary material, which is available to authorized users. BioMed Central 2017-12-15 /pmc/articles/PMC5732437/ /pubmed/29246107 http://dx.doi.org/10.1186/s12859-017-1950-z Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Doig, Kenneth D. Ellul, Jason Fellowes, Andrew Thompson, Ella R. Ryland, Georgina Blombery, Piers Papenfuss, Anthony T. Fox, Stephen B. Canary: an atomic pipeline for clinical amplicon assays |
title | Canary: an atomic pipeline for clinical amplicon assays |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5732437/ https://www.ncbi.nlm.nih.gov/pubmed/29246107 http://dx.doi.org/10.1186/s12859-017-1950-z |
work_keys_str_mv | AT doigkennethd canaryanatomicpipelineforclinicalampliconassays AT elluljason canaryanatomicpipelineforclinicalampliconassays AT fellowesandrew canaryanatomicpipelineforclinicalampliconassays AT thompsonellar canaryanatomicpipelineforclinicalampliconassays AT rylandgeorgina canaryanatomicpipelineforclinicalampliconassays AT blomberypiers canaryanatomicpipelineforclinicalampliconassays AT papenfussanthonyt canaryanatomicpipelineforclinicalampliconassays AT foxstephenb canaryanatomicpipelineforclinicalampliconassays |
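
Illustrative note: the description above says that variants are 3′ shifted and reduced to their most parsimonious form. The sketch below shows one common way such a normalization can be done. It is not taken from Canary's source. The function names `trim` and `shift_3prime`, the 0-based coordinates, and the assumption that the transcript lies on the forward strand (so that 3′ means rightwards along the reference) are all hypothetical choices made for this example.

```python
def trim(pos, ref, alt):
    """Remove bases shared by both alleles (parsimonious form).

    pos is a 0-based reference coordinate of the first ref base.
    """
    # Strip the common prefix, moving the start position right.
    while ref and alt and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    # Strip the common suffix; the start position is unaffected.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    return pos, ref, alt


def shift_3prime(genome, pos, ref, alt):
    """Slide a pure insertion or deletion as far right as possible.

    Assumes a forward-strand transcript; substitutions and delins
    variants are returned trimmed but unshifted.
    """
    pos, ref, alt = trim(pos, ref, alt)
    # Only pure insertions (ref empty) or deletions (alt empty) can slide.
    if bool(ref) == bool(alt):
        return pos, ref, alt
    seq = ref or alt
    # The indel can move one base right whenever the next reference base
    # after it equals the first base of the inserted/deleted sequence.
    while pos + len(ref) < len(genome) and genome[pos + len(ref)] == seq[0]:
        seq = seq[1:] + seq[0]  # rotate the indel sequence
        pos += 1
    return (pos, seq, "") if ref else (pos, "", seq)


if __name__ == "__main__":
    # VCF-style deletion with an anchor base inside a CA repeat.
    print(shift_3prime("GCACACAT", 0, "GCA", "G"))
```

With this sketch, a deletion of one `CA` unit reported anywhere inside the `CACACA` repeat ends up at the same rightmost position, which is the kind of consistency the description attributes to 3′ shifting.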