elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling
elPrep is a high-performance tool for preparing sequence alignment/map files for variant calling in sequencing pipelines. It can be used as a replacement for SAMtools and Picard for preparation steps such as filtering, sorting, marking duplicates, reordering contigs, and so on, while producing ident...
Main authors: | Herzeel, Charlotte; Costanza, Pascal; Decap, Dries; Fostier, Jan; Reumers, Joke |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4504710/ https://www.ncbi.nlm.nih.gov/pubmed/26182406 http://dx.doi.org/10.1371/journal.pone.0132868 |
_version_ | 1782381511780073472 |
---|---|
author | Herzeel, Charlotte; Costanza, Pascal; Decap, Dries; Fostier, Jan; Reumers, Joke |
author_sort | Herzeel, Charlotte |
collection | PubMed |
description | elPrep is a high-performance tool for preparing sequence alignment/map files for variant calling in sequencing pipelines. It can be used as a replacement for SAMtools and Picard for preparation steps such as filtering, sorting, marking duplicates, reordering contigs, and so on, while producing identical results. What sets elPrep apart is its software architecture that allows executing preparation pipelines by making only a single pass through the data, no matter how many preparation steps are used in the pipeline. elPrep is designed as a multithreaded application that runs entirely in memory, avoids repeated file I/O, and merges the computation of several preparation steps to significantly speed up the execution time. For example, for a preparation pipeline of five steps on a whole-exome BAM file (NA12878), we reduce the execution time from about 1:40 hours, when using a combination of SAMtools and Picard, to about 15 minutes when using elPrep, while utilising the same server resources, here 48 threads and 23GB of RAM. For the same pipeline on whole-genome data (NA12878), elPrep reduces the runtime from 24 hours to less than 5 hours. As a typical clinical study may contain sequencing data for hundreds of patients, elPrep can remove several hundreds of hours of computing time, and thus substantially reduce analysis time and cost. |
format | Online Article Text |
id | pubmed-4504710 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4504710 2015-07-17 elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling Herzeel, Charlotte; Costanza, Pascal; Decap, Dries; Fostier, Jan; Reumers, Joke PLoS One Research Article Public Library of Science 2015-07-16 /pmc/articles/PMC4504710/ /pubmed/26182406 http://dx.doi.org/10.1371/journal.pone.0132868 Text en © 2015 Herzeel et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4504710/ https://www.ncbi.nlm.nih.gov/pubmed/26182406 http://dx.doi.org/10.1371/journal.pone.0132868 |
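The description above says elPrep runs a preparation pipeline in a single pass over the data, however many steps the pipeline has. As a rough illustration of that idea only, the Python sketch below composes per-record steps so that each record is read once and passed through every step before the next record is touched. This is not elPrep's code or API: the record layout (a dict with `name`, `flag`, and `rg` keys) and the names `drop_unmapped`, `replace_read_group`, and `single_pass` are hypothetical and chosen for illustration. The only detail taken from the SAM format is that flag bit 0x4 marks an unmapped read.

```python
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical stand-in for one alignment record (not elPrep's data structure).
Record = dict
# A step returns the (possibly modified) record, or None to drop it.
Step = Callable[[Record], Optional[Record]]


def drop_unmapped(rec: Record) -> Optional[Record]:
    """Filter step: drop reads whose SAM flag has the 0x4 (unmapped) bit set."""
    return None if rec["flag"] & 0x4 else rec


def replace_read_group(new_rg: str) -> Step:
    """Build a step that returns a copy of the record with a new read group."""
    def step(rec: Record) -> Optional[Record]:
        return {**rec, "rg": new_rg}
    return step


def single_pass(records: Iterable[Record], steps: list[Step]) -> Iterator[Record]:
    """Apply every step to each record while iterating over the input once."""
    for rec in records:
        out: Optional[Record] = rec
        for step in steps:
            out = step(out)
            if out is None:  # record was filtered out; skip remaining steps
                break
        if out is not None:
            yield out


if __name__ == "__main__":
    reads = [
        {"name": "r1", "flag": 0, "rg": "old"},
        {"name": "r2", "flag": 4, "rg": "old"},
    ]
    for rec in single_pass(reads, [drop_unmapped, replace_read_group("new")]):
        print(rec)
```

Adding a step to the list does not add another pass over the input, which is the property the description highlights. The sketch covers only steps that look at one record at a time; sorting and duplicate marking need to see many records together and are outside its scope.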