
Use of application containers and workflows for genomic data analysis

BACKGROUND: The rapid acquisition of biological data and development of computationally intensive analyses has led to a need for novel approaches to software deployment. In particular, the complexity of common analytic tools for genomics makes them difficult to deploy and decreases the reproducibili...

Bibliographic Details
Main Authors: Schulz, Wade L., Durant, Thomas J. S., Siddon, Alexa J., Torres, Richard
Format: Online Article Text
Language: English
Published: Medknow Publications & Media Pvt Ltd 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5248400/
https://www.ncbi.nlm.nih.gov/pubmed/28163975
http://dx.doi.org/10.4103/2153-3539.197197
_version_ 1782497258286088192
author Schulz, Wade L.
Durant, Thomas J. S.
Siddon, Alexa J.
Torres, Richard
collection PubMed
description BACKGROUND: The rapid acquisition of biological data and development of computationally intensive analyses has led to a need for novel approaches to software deployment. In particular, the complexity of common analytic tools for genomics makes them difficult to deploy and decreases the reproducibility of computational experiments. METHODS: Recent technologies that allow for application virtualization, such as Docker, allow developers and bioinformaticians to isolate these applications and deploy secure, scalable platforms that have the potential to dramatically increase the efficiency of big data processing. RESULTS: While limitations exist, this study demonstrates a successful implementation of a pipeline with several discrete software applications for the analysis of next-generation sequencing (NGS) data. CONCLUSIONS: With this approach, we significantly reduced the amount of time needed to perform clonal analysis from NGS data in acute myeloid leukemia.
format Online
Article
Text
id pubmed-5248400
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Medknow Publications & Media Pvt Ltd
record_format MEDLINE/PubMed
spelling pubmed-52484002017-02-03 Use of application containers and workflows for genomic data analysis Schulz, Wade L. Durant, Thomas J. S. Siddon, Alexa J. Torres, Richard J Pathol Inform Technical Note BACKGROUND: The rapid acquisition of biological data and development of computationally intensive analyses has led to a need for novel approaches to software deployment. In particular, the complexity of common analytic tools for genomics makes them difficult to deploy and decreases the reproducibility of computational experiments. METHODS: Recent technologies that allow for application virtualization, such as Docker, allow developers and bioinformaticians to isolate these applications and deploy secure, scalable platforms that have the potential to dramatically increase the efficiency of big data processing. RESULTS: While limitations exist, this study demonstrates a successful implementation of a pipeline with several discrete software applications for the analysis of next-generation sequencing (NGS) data. CONCLUSIONS: With this approach, we significantly reduced the amount of time needed to perform clonal analysis from NGS data in acute myeloid leukemia. Medknow Publications & Media Pvt Ltd 2016-12-30 /pmc/articles/PMC5248400/ /pubmed/28163975 http://dx.doi.org/10.4103/2153-3539.197197 Text en Copyright: © 2016 Journal of Pathology Informatics http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.
title Use of application containers and workflows for genomic data analysis
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5248400/
https://www.ncbi.nlm.nih.gov/pubmed/28163975
http://dx.doi.org/10.4103/2153-3539.197197