ACO:lossless quality score compression based on adaptive coding order
BACKGROUND: With the rapid development of high-throughput sequencing technology, the cost of whole genome sequencing drops rapidly, which leads to an exponential growth of genome data. How to efficiently compress the DNA data generated by large-scale genome projects has become an important factor re...
Main authors: | Niu, Yi; Ma, Mingming; Li, Fu; Liu, Xianming; Shi, Guangming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9175485/ https://www.ncbi.nlm.nih.gov/pubmed/35672665 http://dx.doi.org/10.1186/s12859-022-04712-z |
_version_ | 1784722466578366464 |
---|---|
author | Niu, Yi Ma, Mingming Li, Fu Liu, Xianming Shi, Guangming |
author_facet | Niu, Yi Ma, Mingming Li, Fu Liu, Xianming Shi, Guangming |
author_sort | Niu, Yi |
collection | PubMed |
description | BACKGROUND: With the rapid development of high-throughput sequencing technology, the cost of whole-genome sequencing has dropped rapidly, leading to exponential growth of genome data. How to efficiently compress the DNA data generated by large-scale genome projects has become an important factor restricting the further development of the DNA sequencing industry. Although the compression of DNA bases has improved significantly in recent years, the compression of quality scores remains challenging. RESULTS: In this paper, by reinvestigating the inherent correlations between the quality score and the sequencing process, we propose a novel lossless quality score compressor based on adaptive coding order (ACO). The main objective of ACO is to traverse the quality scores adaptively along the most correlated trajectory according to the sequencing process. Combined with adaptive arithmetic coding and an improved in-context strategy, ACO achieves state-of-the-art quality score compression performance with moderate complexity for next-generation sequencing (NGS) data. CONCLUSIONS: This performance makes ACO a candidate tool for quality score compression. ACO has been adopted by AVS (Audio Video coding Standard Workgroup of China) and is freely available at https://github.com/Yoniming/ACO. |
format | Online Article Text |
id | pubmed-9175485 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9175485 2022-06-09 ACO:lossless quality score compression based on adaptive coding order Niu, Yi Ma, Mingming Li, Fu Liu, Xianming Shi, Guangming BMC Bioinformatics Research
BioMed Central 2022-06-07 /pmc/articles/PMC9175485/ /pubmed/35672665 http://dx.doi.org/10.1186/s12859-022-04712-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Niu, Yi Ma, Mingming Li, Fu Liu, Xianming Shi, Guangming ACO:lossless quality score compression based on adaptive coding order |
title | ACO:lossless quality score compression based on adaptive coding order |
title_full | ACO:lossless quality score compression based on adaptive coding order |
title_fullStr | ACO:lossless quality score compression based on adaptive coding order |
title_full_unstemmed | ACO:lossless quality score compression based on adaptive coding order |
title_short | ACO:lossless quality score compression based on adaptive coding order |
title_sort | aco:lossless quality score compression based on adaptive coding order |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9175485/ https://www.ncbi.nlm.nih.gov/pubmed/35672665 http://dx.doi.org/10.1186/s12859-022-04712-z |
work_keys_str_mv | AT niuyi acolosslessqualityscorecompressionbasedonadaptivecodingorder AT mamingming acolosslessqualityscorecompressionbasedonadaptivecodingorder AT lifu acolosslessqualityscorecompressionbasedonadaptivecodingorder AT liuxianming acolosslessqualityscorecompressionbasedonadaptivecodingorder AT shiguangming acolosslessqualityscorecompressionbasedonadaptivecodingorder |
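
The description above mentions pairing adaptive arithmetic coding with a context strategy. As a rough illustration only, the Python sketch below estimates how many bits an ideal adaptive arithmetic coder would spend on a sequence of quality scores when each score is predicted from the preceding ones. It is not the ACO algorithm and does not reproduce its adaptive coding order; the function names `phred_scores` and `adaptive_code_length`, the Phred+33 offset, and the add-one smoothing are all assumptions made for this example.

```python
import math
from collections import defaultdict


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def adaptive_code_length(symbols, order, alphabet_size):
    """Estimate the bits an ideal adaptive arithmetic coder would need.

    Each symbol is predicted from the previous `order` symbols. Counts
    are smoothed by adding one per alphabet symbol and are updated only
    after the symbol has been "coded", so a decoder could mirror them.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, sym in enumerate(symbols):
        ctx = tuple(symbols[max(0, i - order):i])
        p = (counts[ctx][sym] + 1) / (totals[ctx] + alphabet_size)
        bits -= math.log2(p)
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


if __name__ == "__main__":
    # Hypothetical quality line, used only to exercise the functions.
    scores = phred_scores("IIIIHHHHGGGGFFFF" * 8)
    for order in (0, 1, 2):
        print(order, round(adaptive_code_length(scores, order, 64), 1))
```

Comparing the totals for different `order` values on real quality lines gives a quick sense of how much a context model can help before any change of traversal order is considered.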