
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation

BACKGROUND: The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. RESULTS: To meet increasing demands for comparing la...


Bibliographic Details
Main authors: Wang, Dapeng, Xu, Jiayue, Yu, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4573299/
https://www.ncbi.nlm.nih.gov/pubmed/26376976
http://dx.doi.org/10.1186/s13062-015-0083-4
collection PubMed
description BACKGROUND: The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. RESULTS: To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK (http://kgcak.big.ac.cn/KGCAK/), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. CONCLUSION: We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data. REVIEWERS: This article was reviewed by Prof Mark Ragan and Dr Yuri Wolf.
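The K-mer counting described in the abstract can be illustrated with a minimal sketch. This is not KGCAK's implementation: the function names `kmer_frequencies` and `euclidean_distance` are hypothetical, and the Euclidean distance is only one assumed choice of alignment-free dissimilarity; the record does not state which measure the database uses.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative abundance of every overlapping K-mer in seq for a fixed k."""
    seq = seq.upper()
    # Slide a window of width k across the sequence, one position at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Assumed dissimilarity: Euclidean distance between two K-mer profiles."""
    kmers = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in kmers))


if __name__ == "__main__":
    a = kmer_frequencies("ACGTACGTAC", 3)
    b = kmer_frequencies("ACGTTTGTAC", 3)
    print(euclidean_distance(a, b))
```

Pairwise distances computed this way for a set of genomes could then be passed to a standard distance-based tree-building method; that step is outside this sketch.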
format Online
Article
Text
id pubmed-4573299
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4573299 2015-09-18 KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation Wang, Dapeng Xu, Jiayue Yu, Jun Biol Direct Application Note BioMed Central 2015-09-16 /pmc/articles/PMC4573299/ /pubmed/26376976 http://dx.doi.org/10.1186/s13062-015-0083-4 Text en © Wang et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Application Note