
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation


Bibliographic Details
Main authors: Wang, Dapeng; Xu, Jiayue; Yu, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4573299/
https://www.ncbi.nlm.nih.gov/pubmed/26376976
http://dx.doi.org/10.1186/s13062-015-0083-4
Description
Summary: BACKGROUND: The K-mer approach, which treats genomic sequences as simple character strings and counts the relative abundance of each string of a fixed length K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. RESULTS: To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK (http://kgcak.big.ac.cn/KGCAK/), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogenies based on genomic elements in an alignment-free fashion and provides in-depth data processing that enables users to compare the complexity of genome sequences based on their K-mer distributions. CONCLUSION: We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data. REVIEWERS: This article was reviewed by Prof Mark Ragan and Dr Yuri Wolf.
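The K-mer idea described in the summary can be illustrated with a minimal Python sketch. It is not KGCAK's implementation: the function names, the restriction to the A/C/G/T alphabet, and the choice of Euclidean distance between frequency profiles are assumptions made only for illustration, since this record does not state which distance measure or complexity statistic KGCAK uses.

```python
from collections import Counter
from math import sqrt

DNA = set("ACGT")  # assumption: ignore windows containing N or other symbols


def kmer_frequencies(seq, k):
    """Return the relative abundance of each K-mer found in seq."""
    seq = seq.upper()
    counts = Counter()
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= DNA:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two K-mer frequency profiles.

    Hypothetical choice of measure: any alignment-free distance could be
    substituted here.
    """
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


if __name__ == "__main__":
    a = kmer_frequencies("ACGTACGTACGT", 2)
    b = kmer_frequencies("ACGTTCGTACGA", 2)
    print(sorted(a.items()))
    print(profile_distance(a, b))
```

A matrix of such pairwise distances over many genomes is the kind of input a distance-based tree-building method (for example, neighbor joining) would take; whether KGCAK proceeds this way is not stated in this record.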