Enhancing knowledge discovery from cancer genomics data with Galaxy
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remain...
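Main authors: | Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.; Morin, Ryan D. |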
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437943/ https://www.ncbi.nlm.nih.gov/pubmed/28327945 http://dx.doi.org/10.1093/gigascience/gix015 |
_version_ | 1783237677513965568 |
---|---|
author | Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.; Morin, Ryan D. |
author_facet | Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.; Morin, Ryan D. |
author_sort | Albuquerque, Marco A. |
collection | PubMed |
description | The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. |
format | Online Article Text |
id | pubmed-5437943 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-54379432017-06-19 Enhancing knowledge discovery from cancer genomics data with Galaxy Albuquerque, Marco A. Grande, Bruno M. Ritch, Elie J. Pararajalingam, Prasath Jessa, Selin Krzywinski, Martin Grewal, Jasleen K. Shah, Sohrab P. Boutros, Paul C. Morin, Ryan D. Gigascience Technical Note The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. Oxford University Press 2017-03-09 /pmc/articles/PMC5437943/ /pubmed/28327945 http://dx.doi.org/10.1093/gigascience/gix015 Text en © The Author 2017. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Albuquerque, Marco A. Grande, Bruno M. Ritch, Elie J. Pararajalingam, Prasath Jessa, Selin Krzywinski, Martin Grewal, Jasleen K. Shah, Sohrab P. Boutros, Paul C. Morin, Ryan D. Enhancing knowledge discovery from cancer genomics data with Galaxy |
title | Enhancing knowledge discovery from cancer genomics data with Galaxy |
title_full | Enhancing knowledge discovery from cancer genomics data with Galaxy |
title_fullStr | Enhancing knowledge discovery from cancer genomics data with Galaxy |
title_full_unstemmed | Enhancing knowledge discovery from cancer genomics data with Galaxy |
title_short | Enhancing knowledge discovery from cancer genomics data with Galaxy |
title_sort | enhancing knowledge discovery from cancer genomics data with galaxy |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437943/ https://www.ncbi.nlm.nih.gov/pubmed/28327945 http://dx.doi.org/10.1093/gigascience/gix015 |