
Human mitochondrial genome compression using machine learning techniques


Bibliographic Details
Main authors: Wang, Rongjie; Zang, Tianyi; Wang, Yadong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6805717/
https://www.ncbi.nlm.nih.gov/pubmed/31639043
http://dx.doi.org/10.1186/s40246-019-0225-3
Description
Summary: BACKGROUND: In recent years, with the development of high-throughput genome sequencing technologies, a large amount of genome data has been generated, raising widespread concern about data storage and transmission costs. However, how to compress genome sequence data effectively remains an unsolved problem. RESULTS: In this paper, we propose a compression method using machine learning techniques (DeepDNA) for compressing human mitochondrial genome data. The experimental results show the effectiveness of our proposed method compared with other methods on human mitochondrial genome data. CONCLUSIONS: The compression method we propose can be classified as a non-reference-based method, but its compression performance is comparable to that of reference-based methods. Moreover, our method achieves good compression results not only on population genomes with large redundancy but also on single genomes with small redundancy. The code of DeepDNA is available at https://github.com/rongjiewang/DeepDNA.