
ACO:lossless quality score compression based on adaptive coding order

BACKGROUND: With the rapid development of high-throughput sequencing technology, the cost of whole genome sequencing has dropped rapidly, leading to exponential growth of genome data. How to efficiently compress the DNA data generated by large-scale genome projects has become an important factor re...

Bibliographic Details
Main authors: Niu, Yi; Ma, Mingming; Li, Fu; Liu, Xianming; Shi, Guangming
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9175485/
https://www.ncbi.nlm.nih.gov/pubmed/35672665
http://dx.doi.org/10.1186/s12859-022-04712-z
author Niu, Yi
Ma, Mingming
Li, Fu
Liu, Xianming
Shi, Guangming
author_sort Niu, Yi
collection PubMed
description BACKGROUND: With the rapid development of high-throughput sequencing technology, the cost of whole genome sequencing has dropped rapidly, leading to exponential growth of genome data. How to efficiently compress the DNA data generated by large-scale genome projects has become an important factor restricting the further development of the DNA sequencing industry. Although the compression of DNA bases has improved significantly in recent years, the compression of quality scores remains challenging. RESULTS: In this paper, by reinvestigating the inherent correlations between the quality score and the sequencing process, we propose a novel lossless quality score compressor based on adaptive coding order (ACO). The main objective of ACO is to traverse the quality scores adaptively along the most correlated trajectory according to the sequencing process. Combined with adaptive arithmetic coding and an improved in-context strategy, ACO achieves state-of-the-art quality score compression performance with moderate complexity for next-generation sequencing (NGS) data. CONCLUSIONS: This capability enables ACO to serve as a candidate tool for quality score compression. ACO has been employed by AVS (Audio Video coding Standard Workgroup of China) and is freely available at https://github.com/Yoniming/ACO.
format Online
Article
Text
id pubmed-9175485
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9175485 2022-06-09 ACO:lossless quality score compression based on adaptive coding order Niu, Yi Ma, Mingming Li, Fu Liu, Xianming Shi, Guangming BMC Bioinformatics Research
BioMed Central 2022-06-07 /pmc/articles/PMC9175485/ /pubmed/35672665 http://dx.doi.org/10.1186/s12859-022-04712-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title ACO:lossless quality score compression based on adaptive coding order
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9175485/
https://www.ncbi.nlm.nih.gov/pubmed/35672665
http://dx.doi.org/10.1186/s12859-022-04712-z