Interspecies gene function prediction using semantic similarity
BACKGROUND: Gene Ontology (GO) is a collaborative project that maintains and develops controlled vocabulary (or terms) to describe the molecular function, biological roles and cellular location of gene products in a hierarchical ontology. GO also provides GO annotations that associate genes with GO...
Main authors: | Yu, Guoxian; Luo, Wei; Fu, Guangyuan; Wang, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260010/ https://www.ncbi.nlm.nih.gov/pubmed/28155711 http://dx.doi.org/10.1186/s12918-016-0361-5 |
_version_ | 1782499323317059584 |
---|---|
author | Yu, Guoxian Luo, Wei Fu, Guangyuan Wang, Jun |
author_facet | Yu, Guoxian Luo, Wei Fu, Guangyuan Wang, Jun |
author_sort | Yu, Guoxian |
collection | PubMed |
description | BACKGROUND: Gene Ontology (GO) is a collaborative project that maintains and develops a controlled vocabulary (or terms) to describe the molecular function, biological roles and cellular location of gene products in a hierarchical ontology. GO also provides GO annotations that associate genes with GO terms. Members of the GO Consortium independently and collaboratively annotate terms to gene products, mainly from the model organisms (or species) they are interested in. Owing to experimental ethics, the research interests of biologists and resource limitations, homologous genes from different species are currently annotated with different terms. These differences can be attributed more to incomplete annotations of genes than to functional differences between them. RESULTS: Semantic similarity between genes is derived from the GO hierarchy and the annotations of genes. It is positively correlated with the similarity derived from various types of biological data and has been applied to predict gene function. In this paper, we investigate whether it is possible to replenish the annotations of incompletely annotated genes by using semantic similarity between genes from two homologous species. For this investigation, we use three representative semantic similarity metrics to compute the similarity between genes from two species. Next, we determine the k nearest neighbor genes from the two species based on the chosen metric and then use the terms annotated to the k neighbors of a gene to replenish the annotations of that gene. We perform experiments on archived (from Jan-2014 to Jan-2016) GO annotations of four species (Human, Mouse, Danio rerio and Arabidopsis thaliana) to assess the contribution of semantic similarity between genes from different species.
The experimental results demonstrate that: (1) semantic similarity between genes from homologous species contributes much more to improved accuracy (by 53.22%) than semantic similarity between genes from a single species alone or between genes from two species with low homology; (2) GO annotations of genes from homologous species are complementary to each other. CONCLUSIONS: Our study shows that semantic-similarity-based interspecies gene function annotation from homologous species is more prominent than traditional intraspecies approaches. This work can promote more research on semantic-similarity-based function prediction across species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-016-0361-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5260010 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5260010 2017-01-26 Interspecies gene function prediction using semantic similarity Yu, Guoxian Luo, Wei Fu, Guangyuan Wang, Jun BMC Syst Biol Research BioMed Central 2016-12-23 /pmc/articles/PMC5260010/ /pubmed/28155711 http://dx.doi.org/10.1186/s12918-016-0361-5 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Yu, Guoxian Luo, Wei Fu, Guangyuan Wang, Jun Interspecies gene function prediction using semantic similarity |
title | Interspecies gene function prediction using semantic similarity |
title_sort | interspecies gene function prediction using semantic similarity |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260010/ https://www.ncbi.nlm.nih.gov/pubmed/28155711 http://dx.doi.org/10.1186/s12918-016-0361-5 |
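
The neighbor-based replenishment step summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the three semantic similarity metrics used, so a plain Jaccard overlap of GO term sets stands in for them here, and the function names (`jaccard`, `replenish`), the similarity-weighted voting rule, the gene labels and the term identifiers are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap of two GO term sets; a stand-in for a real semantic similarity metric."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def replenish(target, annotations, k=2, similarity=jaccard):
    """Rank GO terms missing from `target` by similarity-weighted votes of its k nearest genes."""
    own = annotations[target]
    # Score every other gene (from either species) against the target gene.
    scored = [
        (similarity(own, terms), gene)
        for gene, terms in annotations.items()
        if gene != target
    ]
    # Keep the k most similar genes; ties are broken by gene name for determinism.
    neighbors = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    votes = defaultdict(float)
    for sim, gene in neighbors:
        # Only terms the target does not already carry are candidates.
        for term in annotations[gene] - own:
            votes[term] += sim
    # Highest-scoring candidate terms first.
    return sorted(votes.items(), key=lambda p: (-p[1], p[0]))


# Toy annotations pooled from two species (gene names and term sets are invented).
pooled = {
    "human:A": {"GO:1", "GO:2"},
    "mouse:A": {"GO:1", "GO:2", "GO:3"},
    "mouse:B": {"GO:1", "GO:4"},
    "human:C": {"GO:5"},
}
candidates = replenish("human:A", pooled, k=2)
```

In this toy pool the two nearest neighbors of `human:A` both come from the other species, so every candidate term is contributed across species. A real experiment would replace `jaccard` with a metric that exploits the GO hierarchy and would evaluate the candidates against a later annotation archive.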