LFastqC: A lossless non-reference-based FASTQ compressor
The cost-effectiveness of next-generation sequencing (NGS) has led to the advancement of genomic research, thereby regularly generating a large amount of raw data that often requires efficient infrastructures such as data centers to manage the storage and transmission of such data. The generated NGS...
Main authors: | Al Yami, Sultan; Huang, Chun-Hsi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6855649/ https://www.ncbi.nlm.nih.gov/pubmed/31725736 http://dx.doi.org/10.1371/journal.pone.0224806 |
_version_ | 1783470445603848192 |
---|---|
author | Al Yami, Sultan; Huang, Chun-Hsi |
author_facet | Al Yami, Sultan; Huang, Chun-Hsi |
author_sort | Al Yami, Sultan |
collection | PubMed |
description | The cost-effectiveness of next-generation sequencing (NGS) has led to the advancement of genomic research, thereby regularly generating a large amount of raw data that often requires efficient infrastructures such as data centers to manage the storage and transmission of such data. The generated NGS data are highly redundant and need to be efficiently compressed to reduce the cost of storage space and transmission bandwidth. We present a lossless, non-reference-based FASTQ compression algorithm, known as LFastqC, an improvement over the LFQC tool, to address these issues. LFastqC is compared with several state-of-the-art compressors, and the results indicate that LFastqC achieves better compression ratios for important datasets such as the LS454, PacBio, and MinION. Moreover, LFastqC has a better compression and decompression speed than LFQC, which was previously the top-performing compression algorithm for the LS454 dataset. LFastqC is freely available at https://github.uconn.edu/sya12005/LFastqC. |
format | Online Article Text |
id | pubmed-6855649 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-68556492019-12-06 LFastqC: A lossless non-reference-based FASTQ compressor Al Yami, Sultan Huang, Chun-Hsi PLoS One Research Article The cost-effectiveness of next-generation sequencing (NGS) has led to the advancement of genomic research, thereby regularly generating a large amount of raw data that often requires efficient infrastructures such as data centers to manage the storage and transmission of such data. The generated NGS data are highly redundant and need to be efficiently compressed to reduce the cost of storage space and transmission bandwidth. We present a lossless, non-reference-based FASTQ compression algorithm, known as LFastqC, an improvement over the LFQC tool, to address these issues. LFastqC is compared with several state-of-the-art compressors, and the results indicate that LFastqC achieves better compression ratios for important datasets such as the LS454, PacBio, and MinION. Moreover, LFastqC has a better compression and decompression speed than LFQC, which was previously the top-performing compression algorithm for the LS454 dataset. LFastqC is freely available at https://github.uconn.edu/sya12005/LFastqC. Public Library of Science 2019-11-14 /pmc/articles/PMC6855649/ /pubmed/31725736 http://dx.doi.org/10.1371/journal.pone.0224806 Text en © 2019 Al Yami, Huang http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Al Yami, Sultan Huang, Chun-Hsi LFastqC: A lossless non-reference-based FASTQ compressor |
title | LFastqC: A lossless non-reference-based FASTQ compressor |
title_sort | lfastqc: a lossless non-reference-based fastq compressor |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6855649/ https://www.ncbi.nlm.nih.gov/pubmed/31725736 http://dx.doi.org/10.1371/journal.pone.0224806 |