KEGG_Extractor: An Effective Extraction Tool for KEGG Orthologs
Main authors: | , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9956942/ https://www.ncbi.nlm.nih.gov/pubmed/36833314 http://dx.doi.org/10.3390/genes14020386 |
Summary: | The KEGG Orthology (KO) database is a widely used molecular function reference database that can be used for functional annotation of most microorganisms. Many KEGG tools currently use KO entries to annotate functional orthologs. However, efficiently extracting and sorting KEGG annotation results remains an obstacle to subsequent genome analysis, and effective means of quickly extracting and classifying the gene sequences and species information in KEGG annotations are lacking. Here, we present a supporting tool, KEGG_Extractor, for species-specific gene extraction and classification, which outputs its results through an iterative keyword matching algorithm. It can extract and classify both amino acid and nucleotide sequences, and it has proved to be fast and efficient for microbial analysis. Analysis of the ancient Wood-Ljungdahl (WL) pathway with KEGG_Extractor reveals that ~226 archaeal strains contain WL pathway-related genes. Most of them are Methanococcus maripaludis, Methanosarcina mazei, and members of the genera Methanobacterium, Thermococcus and Methanosarcina. Using KEGG_Extractor, the ARWL database was constructed, which has high accuracy and completeness. This tool helps to link genes with KEGG pathways and promotes the reconstruction of molecular networks. Availability and implementation: KEGG_Extractor is freely available on GitHub. |