
Enhancing knowledge discovery from cancer genomics data with Galaxy

The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remain...


Bibliographic Details
Main authors: Albuquerque, Marco A., Grande, Bruno M., Ritch, Elie J., Pararajalingam, Prasath, Jessa, Selin, Krzywinski, Martin, Grewal, Jasleen K., Shah, Sohrab P., Boutros, Paul C., Morin, Ryan D.
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437943/
https://www.ncbi.nlm.nih.gov/pubmed/28327945
http://dx.doi.org/10.1093/gigascience/gix015
_version_ 1783237677513965568
author Albuquerque, Marco A.
Grande, Bruno M.
Ritch, Elie J.
Pararajalingam, Prasath
Jessa, Selin
Krzywinski, Martin
Grewal, Jasleen K.
Shah, Sohrab P.
Boutros, Paul C.
Morin, Ryan D.
author_facet Albuquerque, Marco A.
Grande, Bruno M.
Ritch, Elie J.
Pararajalingam, Prasath
Jessa, Selin
Krzywinski, Martin
Grewal, Jasleen K.
Shah, Sohrab P.
Boutros, Paul C.
Morin, Ryan D.
author_sort Albuquerque, Marco A.
collection PubMed
description The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker.
format Online
Article
Text
id pubmed-5437943
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-54379432017-06-19 Enhancing knowledge discovery from cancer genomics data with Galaxy Albuquerque, Marco A. Grande, Bruno M. Ritch, Elie J. Pararajalingam, Prasath Jessa, Selin Krzywinski, Martin Grewal, Jasleen K. Shah, Sohrab P. Boutros, Paul C. Morin, Ryan D. Gigascience Technical Note The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. Oxford University Press 2017-03-09 /pmc/articles/PMC5437943/ /pubmed/28327945 http://dx.doi.org/10.1093/gigascience/gix015 Text en © The Author 2017. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Technical Note
Albuquerque, Marco A.
Grande, Bruno M.
Ritch, Elie J.
Pararajalingam, Prasath
Jessa, Selin
Krzywinski, Martin
Grewal, Jasleen K.
Shah, Sohrab P.
Boutros, Paul C.
Morin, Ryan D.
Enhancing knowledge discovery from cancer genomics data with Galaxy
title Enhancing knowledge discovery from cancer genomics data with Galaxy
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437943/
https://www.ncbi.nlm.nih.gov/pubmed/28327945
http://dx.doi.org/10.1093/gigascience/gix015