GIFtS: annotation landscape analysis with GeneCards

BACKGROUND: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric comp...

Bibliographic Details
Main Authors: Harel, Arye; Inger, Aron; Stelzer, Gil; Strichman-Almashanu, Liora; Dalah, Irina; Safran, Marilyn; Lancet, Doron
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774327/
https://www.ncbi.nlm.nih.gov/pubmed/19852797
http://dx.doi.org/10.1186/1471-2105-10-348
_version_ 1782173930590568448
author Harel, Arye
Inger, Aron
Stelzer, Gil
Strichman-Almashanu, Liora
Dalah, Irina
Safran, Marilyn
Lancet, Doron
author_facet Harel, Arye
Inger, Aron
Stelzer, Gil
Strichman-Almashanu, Liora
Dalah, Irina
Safran, Marilyn
Lancet, Doron
author_sort Harel, Arye
collection PubMed
description BACKGROUND: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more. RESULTS: We present the GeneCards Inferred Functionality Score (GIFtS) which allows a quantitative assessment of a gene's annotation status, by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source, and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes.
The degree of accumulated knowledge for a given gene measured by GIFtS was correlated (for GIFtS>30) with the number of publications for a gene, and with the seniority of this entry in the HGNC database. CONCLUSION: GIFtS can be a valuable tool for computational procedures which analyze large lists of genes resulting from wet-lab or computational research. GIFtS may also assist the scientific community with identification of groups of uncharacterized genes for diverse applications, such as delineation of novel functions and charting unexplored areas of the human genome.
format Text
id pubmed-2774327
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2774327 2009-11-07 GIFtS: annotation landscape analysis with GeneCards Harel, Arye Inger, Aron Stelzer, Gil Strichman-Almashanu, Liora Dalah, Irina Safran, Marilyn Lancet, Doron BMC Bioinformatics Research Article BACKGROUND: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more. RESULTS: We present the GeneCards Inferred Functionality Score (GIFtS) which allows a quantitative assessment of a gene's annotation status, by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source, and the average GIFtS value of genes associated with that source.
Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes. The degree of accumulated knowledge for a given gene measured by GIFtS was correlated (for GIFtS>30) with the number of publications for a gene, and with the seniority of this entry in the HGNC database. CONCLUSION: GIFtS can be a valuable tool for computational procedures which analyze large lists of genes resulting from wet-lab or computational research. GIFtS may also assist the scientific community with identification of groups of uncharacterized genes for diverse applications, such as delineation of novel functions and charting unexplored areas of the human genome. BioMed Central 2009-10-23 /pmc/articles/PMC2774327/ /pubmed/19852797 http://dx.doi.org/10.1186/1471-2105-10-348 Text en Copyright © 2009 Harel et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Harel, Arye
Inger, Aron
Stelzer, Gil
Strichman-Almashanu, Liora
Dalah, Irina
Safran, Marilyn
Lancet, Doron
GIFtS: annotation landscape analysis with GeneCards
title GIFtS: annotation landscape analysis with GeneCards
title_full GIFtS: annotation landscape analysis with GeneCards
title_fullStr GIFtS: annotation landscape analysis with GeneCards
title_full_unstemmed GIFtS: annotation landscape analysis with GeneCards
title_short GIFtS: annotation landscape analysis with GeneCards
title_sort gifts: annotation landscape analysis with genecards
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774327/
https://www.ncbi.nlm.nih.gov/pubmed/19852797
http://dx.doi.org/10.1186/1471-2105-10-348
work_keys_str_mv AT harelarye giftsannotationlandscapeanalysiswithgenecards
AT ingeraron giftsannotationlandscapeanalysiswithgenecards
AT stelzergil giftsannotationlandscapeanalysiswithgenecards
AT strichmanalmashanuliora giftsannotationlandscapeanalysiswithgenecards
AT dalahirina giftsannotationlandscapeanalysiswithgenecards
AT safranmarilyn giftsannotationlandscapeanalysiswithgenecards
AT lancetdoron giftsannotationlandscapeanalysiswithgenecards