SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation
FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particul...
Main authors: | Shen, Wei; Le, Shuai; Li, Yan; Hu, Fuquan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051824/ https://www.ncbi.nlm.nih.gov/pubmed/27706213 http://dx.doi.org/10.1371/journal.pone.0163962 |
_version_ | 1782458149168480256 |
---|---|
author | Shen, Wei; Le, Shuai; Li, Yan; Hu, Fuquan |
author_facet | Shen, Wei; Le, Shuai; Li, Yan; Hu, Fuquan |
author_sort | Shen, Wei |
collection | PubMed |
description | FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user-friendly. This paper describes a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OS X, and can be directly used without any dependencies or pre-configurations. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. SeqKit is open source and available on GitHub at https://github.com/shenwei356/seqkit. |
format | Online Article Text |
id | pubmed-5051824 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5051824 2016-10-27 SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation Shen, Wei; Le, Shuai; Li, Yan; Hu, Fuquan PLoS One Research Article FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q file include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. This paper describes a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OSX, and can be directly used without any dependencies or pre-configurations. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. SeqKit is open source and available on Github at https://github.com/shenwei356/seqkit. Public Library of Science 2016-10-05 /pmc/articles/PMC5051824/ /pubmed/27706213 http://dx.doi.org/10.1371/journal.pone.0163962 Text en © 2016 Shen et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Shen, Wei Le, Shuai Li, Yan Hu, Fuquan SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title_full | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title_fullStr | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title_full_unstemmed | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title_short | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation |
title_sort | seqkit: a cross-platform and ultrafast toolkit for fasta/q file manipulation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051824/ https://www.ncbi.nlm.nih.gov/pubmed/27706213 http://dx.doi.org/10.1371/journal.pone.0163962 |
work_keys_str_mv | AT shenwei seqkitacrossplatformandultrafasttoolkitforfastaqfilemanipulation AT leshuai seqkitacrossplatformandultrafasttoolkitforfastaqfilemanipulation AT liyan seqkitacrossplatformandultrafasttoolkitforfastaqfilemanipulation AT hufuquan seqkitacrossplatformandultrafasttoolkitforfastaqfilemanipulation |