
Genic insights from integrated human proteomics in GeneCards

GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and class...


Bibliographic Details
Main authors: Fishilevich, Simon, Zimmerman, Shahar, Kohn, Asher, Iny Stein, Tsippi, Olender, Tsviya, Kolker, Eugene, Safran, Marilyn, Lancet, Doron
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4820835/
https://www.ncbi.nlm.nih.gov/pubmed/27048349
http://dx.doi.org/10.1093/database/baw030
_version_ 1782425474865037312
author Fishilevich, Simon
Zimmerman, Shahar
Kohn, Asher
Iny Stein, Tsippi
Olender, Tsviya
Kolker, Eugene
Safran, Marilyn
Lancet, Doron
author_facet Fishilevich, Simon
Zimmerman, Shahar
Kohn, Asher
Iny Stein, Tsippi
Olender, Tsviya
Kolker, Eugene
Safran, Marilyn
Lancet, Doron
author_sort Fishilevich, Simon
collection PubMed
description GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite’s next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein–RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information.
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL: http://www.genecards.org/
format Online
Article
Text
id pubmed-4820835
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4820835 2016-04-06 Genic insights from integrated human proteomics in GeneCards Fishilevich, Simon Zimmerman, Shahar Kohn, Asher Iny Stein, Tsippi Olender, Tsviya Kolker, Eugene Safran, Marilyn Lancet, Doron Database (Oxford) Original Article GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification.
This textual annotation allows users of VarElect, the suite’s next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein–RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL: http://www.genecards.org/ Oxford University Press 2016-03-28 /pmc/articles/PMC4820835/ /pubmed/27048349 http://dx.doi.org/10.1093/database/baw030 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Fishilevich, Simon
Zimmerman, Shahar
Kohn, Asher
Iny Stein, Tsippi
Olender, Tsviya
Kolker, Eugene
Safran, Marilyn
Lancet, Doron
Genic insights from integrated human proteomics in GeneCards
title Genic insights from integrated human proteomics in GeneCards
title_full Genic insights from integrated human proteomics in GeneCards
title_fullStr Genic insights from integrated human proteomics in GeneCards
title_full_unstemmed Genic insights from integrated human proteomics in GeneCards
title_short Genic insights from integrated human proteomics in GeneCards
title_sort genic insights from integrated human proteomics in genecards
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4820835/
https://www.ncbi.nlm.nih.gov/pubmed/27048349
http://dx.doi.org/10.1093/database/baw030
work_keys_str_mv AT fishilevichsimon genicinsightsfromintegratedhumanproteomicsingenecards
AT zimmermanshahar genicinsightsfromintegratedhumanproteomicsingenecards
AT kohnasher genicinsightsfromintegratedhumanproteomicsingenecards
AT inysteintsippi genicinsightsfromintegratedhumanproteomicsingenecards
AT olendertsviya genicinsightsfromintegratedhumanproteomicsingenecards
AT kolkereugene genicinsightsfromintegratedhumanproteomicsingenecards
AT safranmarilyn genicinsightsfromintegratedhumanproteomicsingenecards
AT lancetdoron genicinsightsfromintegratedhumanproteomicsingenecards