Optimizing performance of GATK workflows using Apache Arrow In-Memory data framework

BACKGROUND: Immense improvements in sequencing technologies enable producing large amounts of high-throughput and cost-effective next-generation sequencing (NGS) data. This data needs to be processed efficiently for further downstream analyses. Computing systems need these large amounts of data close...

Bibliographic Details
Main Authors:	Ahmad, Tanveer, Ahmed, Nauman, Al-Ars, Zaid, Hofstee, H. Peter
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7677819/ https://www.ncbi.nlm.nih.gov/pubmed/33208101 http://dx.doi.org/10.1186/s12864-020-07013-y

Similar Items

Recommendations for performance optimizations when using GATK3.8 and GATK4
by: Heldenbrand, Jacob R, et al.
Published: (2019)

SparkRA: Enabling Big Data Scalability for the GATK RNA-seq Pipeline with Apache Spark
by: Al-Ars, Zaid, et al.
Published: (2020)

GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller
by: Ren, Shanshan, et al.
Published: (2019)

SparkGA2: Production-quality memory-efficient Apache Spark based genome analysis framework
by: Mushtaq, Hamid, et al.
Published: (2019)

OVarFlow: a resource optimized GATK 4 based Open source Variant calling workFlow
by: Bathke, Jochen, et al.
Published: (2021)

GPU acceleration of Darwin read overlapper for de novo assembly of long DNA reads
by: Ahmed, Nauman, et al.
Published: (2020)

VC@Scale: Scalable and high-performance variant calling on cluster environments
by: Ahmad, Tanveer, et al.
Published: (2021)

Efficient Acceleration of the Pair-HMMs Forward Algorithm for GATK HaplotypeCaller on Graphics Processing Units
by: Ren, Shanshan, et al.
Published: (2018)

Correction to: Recommendations for performance optimizations when using GATK3.8 and GATK4
by: Heldenbrand, Jacob R., et al.
Published: (2019)

GASAL2: a GPU accelerated sequence alignment library for high-throughput NGS data
by: Ahmed, Nauman, et al.
Published: (2019)

The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments
by: Brouard, Jean-Simon, et al.
Published: (2019)

DECA: scalable XHMM exome copy-number variant calling with ADAM and Apache Spark
by: Linderman, Michael D., et al.
Published: (2019)

pmTM-align: scalable pairwise and multiple structure alignment with Apache Spark and OpenMP
by: Chen, Weiya, et al.
Published: (2020)

CaGrid Workflow Toolkit: A taverna based workflow tool for cancer grid
by: Tan, Wei, et al.
Published: (2010)

Agalma: an automated phylogenomics workflow
by: Dunn, Casey W, et al.
Published: (2013)

Comparison of GATK and DeepVariant by trio sequencing
by: Lin, Yi-Lin, et al.
Published: (2022)

KNIME-CDK: Workflow-driven cheminformatics
by: Beisken, Stephan, et al.
Published: (2013)

DOPA: GPU-based protein alignment using database and memory access optimizations
by: Hasan, Laiq, et al.
Published: (2011)

CDK-Taverna: an open workflow environment for cheminformatics
by: Kuhn, Thomas, et al.
Published: (2010)

Workflows for microarray data processing in the Kepler environment
by: Stropp, Thomas, et al.
Published: (2012)

Spherical: an iterative workflow for assembling metagenomic datasets
by: Hitch, Thomas C. A., et al.
Published: (2018)

ROBOT: A Tool for Automating Ontology Workflows
by: Jackson, Rebecca C., et al.
Published: (2019)

RASflow: an RNA-Seq analysis workflow with Snakemake
by: Zhang, Xiaokang, et al.
Published: (2020)

RIG: Recalibration and Interrelation of Genomic Sequence Data with the GATK
by: McCormick, Ryan F., et al.
Published: (2015)

Wildfire: distributed, Grid-enabled workflow construction and execution
by: Tang, Francis, et al.
Published: (2005)

MOIRAI: a compact workflow system for CAGE analysis
by: Hasegawa, Akira, et al.
Published: (2014)

Deploying and sharing U-Compare workflows as web services
by: Kontonatsios, Georgios, et al.
Published: (2013)

Cytoscape Automation: empowering workflow-based network analysis
by: Otasek, David, et al.
Published: (2019)

Peptimapper: proteogenomics workflow for the expert annotation of eukaryotic genomes
by: Guillot, Laetitia, et al.
Published: (2019)

ImmunoNodes – graphical development of complex immunoinformatics workflows
by: Schubert, Benjamin, et al.
Published: (2017)

An Optimized GATK4 Pipeline for Plasmodium falciparum Whole Genome Sequencing Variant Calling and Analysis
by: Niaré, Karamoko, et al.
Published: (2023)

An optimized GATK4 pipeline for Plasmodium falciparum whole genome sequencing variant calling and analysis
by: Niaré, Karamoko, et al.
Published: (2023)

Evaluation of an optimized germline exomes pipeline using BWA-MEM2 and Dragen-GATK tools
by: Alganmi, Nofe, et al.
Published: (2023)

BioMoby extensions to the Taverna workflow management and enactment software
by: Kawas, Edward, et al.
Published: (2006)

An integrated ChIP-seq analysis platform with customizable workflows
by: Giannopoulou, Eugenia G, et al.
Published: (2011)

New developments on the cheminformatics open workflow environment CDK-Taverna
by: Truszkowski, Andreas, et al.
Published: (2011)

Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support
by: Abouelhoda, Mohamed, et al.
Published: (2012)

Pathomx: an interactive workflow-based tool for the analysis of metabolomic data
by: Fitzpatrick, Martin A, et al.
Published: (2014)

From the desktop to the grid: scalable bioinformatics via workflow conversion
by: de la Garza, Luis, et al.
Published: (2016)

systemPipeR: NGS workflow and report generation environment
by: H. Backman, Tyler W., et al.
Published: (2016)
