
Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer

Background: The first step of virtually all next generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as “demultiplexing”. However, we found that existing software for this purpose was either too inflexib...


Bibliographic Details
Main Authors:	Wilkins, Oscar G, Capitanchik, Charlotte, Luscombe, Nicholas M., Ule, Jernej
Format:	Online Article Text
Language:	English
Published:	F1000 Research Limited 2021
Subjects:	Software Tool Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8287537/ https://www.ncbi.nlm.nih.gov/pubmed/34286104 http://dx.doi.org/10.12688/wellcomeopenres.16791.1

_version_	1783723926909616128
author	Wilkins, Oscar G Capitanchik, Charlotte Luscombe, Nicholas M. Ule, Jernej
author_facet	Wilkins, Oscar G Capitanchik, Charlotte Luscombe, Nicholas M. Ule, Jernej
author_sort	Wilkins, Oscar G
collection	PubMed
description	Background: The first step of virtually all next-generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as “demultiplexing”. However, we found that existing software for this purpose was either too inflexible or too computationally intensive for fast, streamlined processing of raw, single-end FASTQ files containing combinatorial barcodes. Results: Here, we introduce a fast and uniquely flexible demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both 5’ and 3’ ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex is able to perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than 20 minutes. Conclusions: Ultraplex greatly reduces computational burden and pipeline complexity for the demultiplexing of complex sequencing libraries, such as those produced by various CLIP and ribosome profiling protocols, and is also very user-friendly, enabling streamlined, robust data processing. Ultraplex is available on PyPI and Conda and via GitHub.
format	Online Article Text
id	pubmed-8287537
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	F1000 Research Limited
record_format	MEDLINE/PubMed
spelling	pubmed-82875372021-07-19 Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer Wilkins, Oscar G Capitanchik, Charlotte Luscombe, Nicholas M. Ule, Jernej Wellcome Open Res Software Tool Article Background: The first step of virtually all next generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as “demultiplexing”. However, we found that existing software for this purpose was either too inflexible or too computationally intensive for fast, streamlined processing of raw, single end fastq files containing combinatorial barcodes. Results: Here, we introduce a fast and uniquely flexible demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both 5’ and 3’ ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex is able to perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than 20 minutes. Conclusions: Ultraplex greatly reduces computational burden and pipeline complexity for the demultiplexing of complex sequencing libraries, such as those produced by various CLIP and ribosome profiling protocols, and is also very user friendly, enabling streamlined, robust data processing. Ultraplex is available on PyPi and Conda and via Github. F1000 Research Limited 2021-06-07 /pmc/articles/PMC8287537/ /pubmed/34286104 http://dx.doi.org/10.12688/wellcomeopenres.16791.1 Text en Copyright: © 2021 Wilkins OG et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Software Tool Article Wilkins, Oscar G Capitanchik, Charlotte Luscombe, Nicholas M. Ule, Jernej Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title	Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title_full	Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title_fullStr	Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title_full_unstemmed	Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title_short	Ultraplex: A rapid, flexible, all-in-one fastq demultiplexer
title_sort	ultraplex: a rapid, flexible, all-in-one fastq demultiplexer
topic	Software Tool Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8287537/ https://www.ncbi.nlm.nih.gov/pubmed/34286104 http://dx.doi.org/10.12688/wellcomeopenres.16791.1
work_keys_str_mv	AT wilkinsoscarg ultraplexarapidflexibleallinonefastqdemultiplexer AT capitanchikcharlotte ultraplexarapidflexibleallinonefastqdemultiplexer AT luscombenicholasm ultraplexarapidflexibleallinonefastqdemultiplexer AT ulejernej ultraplexarapidflexibleallinonefastqdemultiplexer
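
The abstract describes three operations on each read: assigning it to a sample by its barcode, trimming the barcode from the read, and moving the UMI into the read header. The following is a minimal Python sketch of that idea for a 5’ barcode only, not Ultraplex's actual implementation or interface. The pattern notation (N marks a UMI position, any other letter is a fixed sample-barcode base), the function names `split_pattern` and `demultiplex`, the `no_match` bucket, and the underscore-separated header suffix are all assumptions made for illustration. The sketch requires exact barcode matches and does no adaptor or quality trimming.

```python
from collections import defaultdict


def split_pattern(pattern):
    """Return (barcode positions, UMI positions) for a pattern such as 'NNNATGNN'.

    Assumed notation: 'N' marks a UMI base, any other letter a fixed barcode base.
    """
    umi_idx = [i for i, c in enumerate(pattern) if c == "N"]
    bc_idx = [i for i, c in enumerate(pattern) if c != "N"]
    return bc_idx, umi_idx


def demultiplex(reads, patterns):
    """Assign reads to samples by an exact match of the 5' barcode.

    reads: iterable of (header, sequence, quality) tuples.
    patterns: dict mapping a sample name to its 5' pattern.
    Returns a dict mapping sample name (or 'no_match') to a list of reads.
    Matched reads have the barcode/UMI prefix trimmed from sequence and
    quality, and the UMI appended to the header.
    """
    out = defaultdict(list)
    for header, seq, qual in reads:
        assigned = "no_match"
        new_read = (header, seq, qual)  # unmatched reads are kept unchanged
        for sample, pattern in patterns.items():
            if len(seq) < len(pattern):
                continue  # read too short to contain this pattern
            bc_idx, umi_idx = split_pattern(pattern)
            if all(seq[i] == pattern[i] for i in bc_idx):
                umi = "".join(seq[i] for i in umi_idx)
                n = len(pattern)
                new_read = (f"{header}_{umi}", seq[n:], qual[n:])
                assigned = sample
                break
        out[assigned].append(new_read)
    return dict(out)


# Hypothetical reads and sample patterns, for illustration only.
reads = [
    ("@r1", "ACGATGTTCCGGAA", "IIIIIIIIIIIIII"),
    ("@r2", "TTTCCAGGAACCGG", "IIIIIIIIIIIIII"),
    ("@r3", "GGGGGGGGGGGGGG", "IIIIIIIIIIIIII"),
]
patterns = {"sample_A": "NNNATGNN", "sample_B": "NNNCCANN"}
result = demultiplex(reads, patterns)
for sample, sample_reads in result.items():
    print(sample, sample_reads)
```

Keeping the UMI in the header rather than in the sequence means a downstream deduplication step can still see it after the read has been trimmed and aligned. A practical tool would also need to tolerate sequencing errors in the barcode and handle 3’ barcodes, both of which this sketch omits.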
