fastp: an ultra-fast all-in-one FASTQ preprocessor
MOTIVATION: Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most a...
Main authors: | Chen, Shifu; Zhou, Yanqing; Chen, Yaru; Gu, Jia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2018 |
Subjects: | Eccb 2018: European Conference on Computational Biology Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129281/ https://www.ncbi.nlm.nih.gov/pubmed/30423086 http://dx.doi.org/10.1093/bioinformatics/bty560 |
_version_ | 1783353773143359488 |
---|---|
author | Chen, Shifu Zhou, Yanqing Chen, Yaru Gu, Jia |
author_facet | Chen, Shifu Zhou, Yanqing Chen, Yaru Gu, Jia |
author_sort | Chen, Shifu |
collection | PubMed |
description | MOTIVATION: Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most are developed using high-level programming languages (e.g. Python and Java) and provide limited multi-threading support. Reading and loading data multiple times also renders preprocessing slow and I/O inefficient. RESULTS: We developed fastp as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features. It can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data. This tool is developed in C++ and has multi-threading support. Based on our evaluation, fastp is 2–5 times faster than other FASTQ preprocessing tools such as Trimmomatic or Cutadapt despite performing far more operations than similar tools. AVAILABILITY AND IMPLEMENTATION: The open-source code and corresponding instructions are available at https://github.com/OpenGene/fastp. |
format | Online Article Text |
id | pubmed-6129281 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-61292812018-09-12 fastp: an ultra-fast all-in-one FASTQ preprocessor Chen, Shifu Zhou, Yanqing Chen, Yaru Gu, Jia Bioinformatics Eccb 2018: European Conference on Computational Biology Proceedings MOTIVATION: Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most are developed using high-level programming languages (e.g. Python and Java) and provide limited multi-threading support. Reading and loading data multiple times also renders preprocessing slow and I/O inefficient. RESULTS: We developed fastp as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features. It can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data. This tool is developed in C++ and has multi-threading support. Based on our evaluation, fastp is 2–5 times faster than other FASTQ preprocessing tools such as Trimmomatic or Cutadapt despite performing far more operations than similar tools. AVAILABILITY AND IMPLEMENTATION: The open-source code and corresponding instructions are available at https://github.com/OpenGene/fastp. Oxford University Press 2018-09-01 2018-09-08 /pmc/articles/PMC6129281/ /pubmed/30423086 http://dx.doi.org/10.1093/bioinformatics/bty560 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. 
For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Eccb 2018: European Conference on Computational Biology Proceedings Chen, Shifu Zhou, Yanqing Chen, Yaru Gu, Jia fastp: an ultra-fast all-in-one FASTQ preprocessor |
title | fastp: an ultra-fast all-in-one FASTQ preprocessor |
title_full | fastp: an ultra-fast all-in-one FASTQ preprocessor |
title_fullStr | fastp: an ultra-fast all-in-one FASTQ preprocessor |
title_full_unstemmed | fastp: an ultra-fast all-in-one FASTQ preprocessor |
title_short | fastp: an ultra-fast all-in-one FASTQ preprocessor |
title_sort | fastp: an ultra-fast all-in-one fastq preprocessor |
topic | Eccb 2018: European Conference on Computational Biology Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129281/ https://www.ncbi.nlm.nih.gov/pubmed/30423086 http://dx.doi.org/10.1093/bioinformatics/bty560 |
work_keys_str_mv | AT chenshifu fastpanultrafastallinonefastqpreprocessor AT zhouyanqing fastpanultrafastallinonefastqpreprocessor AT chenyaru fastpanultrafastallinonefastqpreprocessor AT gujia fastpanultrafastallinonefastqpreprocessor |