SGFSC: speeding the gene functional similarity calculation based on hash tables

BACKGROUND: In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their t...


Bibliographic Details
Main Authors: Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096311/
https://www.ncbi.nlm.nih.gov/pubmed/27814675
http://dx.doi.org/10.1186/s12859-016-1294-0
author Tian, Zhen
Wang, Chunyu
Guo, Maozu
Liu, Xiaoyan
Teng, Zhixia
collection PubMed
description BACKGROUND: In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they are dependent on the combination of semantic similarity. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. RESULTS: To speed current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. There is no need to traverse the GO graph repeatedly for each method with the help of the hash table. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. CONCLUSIONS: The proposed strategy is successful in speeding current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC. The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ.
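The two-step strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology and its term names are hypothetical (not real GO identifiers), the function names are invented, and the Jaccard-style group-wise score is a stand-in that is not claimed to be one of the seven measures bundled with SGFSC. Step 1 traverses the graph once and stores each term's ancestor set in a hash table; step 2 scores gene pairs using table lookups only.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical toy ontology: term -> direct parent terms (not real GO IDs).
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
}


def build_ancestor_table(parents: Dict[str, Set[str]]) -> Dict[str, FrozenSet[str]]:
    """Step 1: visit each term once and cache its ancestors (the term itself included)."""
    table: Dict[str, FrozenSet[str]] = {}

    def ancestors(term: str) -> FrozenSet[str]:
        if term not in table:
            found = {term}
            for parent in parents[term]:
                found |= ancestors(parent)  # reuse cached results for shared parents
            table[term] = frozenset(found)
        return table[term]

    for term in parents:
        ancestors(term)
    return table


def gene_similarity(
    terms_a: Iterable[str],
    terms_b: Iterable[str],
    table: Dict[str, FrozenSet[str]],
) -> float:
    """Step 2: Jaccard overlap of the two genes' induced term sets, via lookups only."""
    induced_a = set().union(*(table[t] for t in terms_a))
    induced_b = set().union(*(table[t] for t in terms_b))
    union = induced_a | induced_b
    return len(induced_a & induced_b) / len(union) if union else 0.0


# Built once, then shared by every pairwise comparison.
ANCESTOR_TABLE = build_ancestor_table(PARENTS)
```

Once the table exists, scoring a large number of gene pairs reuses the cached ancestor sets rather than walking the graph again for each pair, which is the source of the speed-up the abstract describes.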
format Online
Article
Text
id pubmed-5096311
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5096311 2016-11-07 SGFSC: speeding the gene functional similarity calculation based on hash tables Tian, Zhen Wang, Chunyu Guo, Maozu Liu, Xiaoyan Teng, Zhixia BMC Bioinformatics Methodology Article
BioMed Central 2016-11-04 /pmc/articles/PMC5096311/ /pubmed/27814675 http://dx.doi.org/10.1186/s12859-016-1294-0 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SGFSC: speeding the gene functional similarity calculation based on hash tables
topic Methodology Article