VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data
BACKGROUND: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. RE...
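Main authors: | Christley, Scott; Levin, Mikhail K.; Toby, Inimary T.; Fonner, John M.; Monson, Nancy L.; Rounds, William H.; Rubelt, Florian; Scarborough, Walter; Scheuermann, Richard H.; Cowell, Lindsay G. |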
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5637252/ https://www.ncbi.nlm.nih.gov/pubmed/29020925 http://dx.doi.org/10.1186/s12859-017-1853-z |
_version_ | 1783270592213942272 |
---|---|
author | Christley, Scott; Levin, Mikhail K.; Toby, Inimary T.; Fonner, John M.; Monson, Nancy L.; Rounds, William H.; Rubelt, Florian; Scarborough, Walter; Scheuermann, Richard H.; Cowell, Lindsay G. |
author_facet | Christley, Scott; Levin, Mikhail K.; Toby, Inimary T.; Fonner, John M.; Monson, Nancy L.; Rounds, William H.; Rubelt, Florian; Scarborough, Walter; Scheuermann, Richard H.; Cowell, Lindsay G. |
author_sort | Christley, Scott |
collection | PubMed |
description | BACKGROUND: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. RESULTS: Processing tasks provided by VDJPipe include base composition statistics calculation, read quality statistics calculation, quality filtering, homopolymer filtering, length and nucleotide filtering, paired-read merging, barcode demultiplexing, 5′ and 3′ PCR primer matching, and duplicate reads collapsing. VDJPipe utilizes a pipeline approach whereby multiple processing steps are performed in a sequential workflow, with the output of each step passed as input to the next step automatically. The workflow is flexible enough to handle the complex barcoding schemes used in many immunosequencing experiments. Because VDJPipe is designed for computational efficiency, we evaluated this by comparing execution times with those of pRESTO, a widely used pre-processing tool for immune repertoire sequencing data. We found that VDJPipe requires <10% of the run time required by pRESTO. CONCLUSIONS: VDJPipe is a high-performance tool that is optimized for pre-processing large immune repertoire sequencing data sets. |
format | Online Article Text |
id | pubmed-5637252 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5637252 2017-10-18 VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data Christley, Scott Levin, Mikhail K. Toby, Inimary T. Fonner, John M. Monson, Nancy L. Rounds, William H. Rubelt, Florian Scarborough, Walter Scheuermann, Richard H. Cowell, Lindsay G. BMC Bioinformatics Software BACKGROUND: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. RESULTS: Processing tasks provided by VDJPipe include base composition statistics calculation, read quality statistics calculation, quality filtering, homopolymer filtering, length and nucleotide filtering, paired-read merging, barcode demultiplexing, 5′ and 3′ PCR primer matching, and duplicate reads collapsing. VDJPipe utilizes a pipeline approach whereby multiple processing steps are performed in a sequential workflow, with the output of each step passed as input to the next step automatically. The workflow is flexible enough to handle the complex barcoding schemes used in many immunosequencing experiments. Because VDJPipe is designed for computational efficiency, we evaluated this by comparing execution times with those of pRESTO, a widely used pre-processing tool for immune repertoire sequencing data. We found that VDJPipe requires <10% of the run time required by pRESTO. CONCLUSIONS: VDJPipe is a high-performance tool that is optimized for pre-processing large immune repertoire sequencing data sets. BioMed Central 2017-10-11 /pmc/articles/PMC5637252/ /pubmed/29020925 http://dx.doi.org/10.1186/s12859-017-1853-z Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Christley, Scott Levin, Mikhail K. Toby, Inimary T. Fonner, John M. Monson, Nancy L. Rounds, William H. Rubelt, Florian Scarborough, Walter Scheuermann, Richard H. Cowell, Lindsay G. VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data |
title | VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5637252/ https://www.ncbi.nlm.nih.gov/pubmed/29020925 http://dx.doi.org/10.1186/s12859-017-1853-z |
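
The description above says that processing steps run in a sequential workflow, with each step's output passed to the next, in a single pass over the data. The following is a minimal Python sketch of that general idea only. It is not VDJPipe's implementation or configuration format; the names `Read`, `min_length`, `max_homopolymer`, and `run_pipeline`, and all thresholds, are hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence, Tuple

# Hypothetical read representation: (read id, nucleotide sequence).
Read = Tuple[str, str]
# A step returns the (possibly modified) read, or None to discard it.
Step = Callable[[Read], Optional[Read]]


def min_length(n: int) -> Step:
    """Length filter: discard reads shorter than n bases."""
    def step(read: Read) -> Optional[Read]:
        return read if len(read[1]) >= n else None
    return step


def max_homopolymer(limit: int) -> Step:
    """Homopolymer filter: discard reads with a run of one base longer than limit."""
    def step(read: Read) -> Optional[Read]:
        run, prev = 0, ""
        for base in read[1]:
            run = run + 1 if base == prev else 1
            prev = base
            if run > limit:
                return None
        return read
    return step


def run_pipeline(reads: Iterable[Read], steps: Sequence[Step]) -> Iterator[Read]:
    """Stream each read through every step once, then collapse exact duplicates."""
    seen = set()
    for read in reads:
        current: Optional[Read] = read
        for step in steps:
            current = step(current)
            if current is None:
                break  # read was filtered out; skip remaining steps
        if current is not None and current[1] not in seen:
            seen.add(current[1])
            yield current


if __name__ == "__main__":
    reads = [
        ("r1", "ACGTACGT"),
        ("r2", "ACG"),        # too short
        ("r3", "AAAAAAGT"),   # homopolymer run of six A's
        ("r4", "ACGTACGT"),   # exact duplicate of r1
    ]
    for read in run_pipeline(reads, [min_length(5), max_homopolymer(4)]):
        print(read)
```

Because the reads are consumed from an iterator and every step is applied before the next read is fetched, the input is traversed only once, regardless of how many steps are configured. Only the set of sequences already seen is held in memory for duplicate collapsing.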