Genic insights from integrated human proteomics in GeneCards
GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and class...
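Main authors: | Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron |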
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4820835/ https://www.ncbi.nlm.nih.gov/pubmed/27048349 http://dx.doi.org/10.1093/database/baw030 |
_version_ | 1782425474865037312 |
---|---|
author | Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron |
author_facet | Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron |
author_sort | Fishilevich, Simon |
collection | PubMed |
description | GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based proximity between genes. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite’s next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein–RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL: http://www.genecards.org/ |
format | Online Article Text |
id | pubmed-4820835 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4820835 2016-04-06 Genic insights from integrated human proteomics in GeneCards Fishilevich, Simon Zimmerman, Shahar Kohn, Asher Iny Stein, Tsippi Olender, Tsviya Kolker, Eugene Safran, Marilyn Lancet, Doron Database (Oxford) Original Article Oxford University Press 2016-03-28 /pmc/articles/PMC4820835/ /pubmed/27048349 http://dx.doi.org/10.1093/database/baw030 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Fishilevich, Simon Zimmerman, Shahar Kohn, Asher Iny Stein, Tsippi Olender, Tsviya Kolker, Eugene Safran, Marilyn Lancet, Doron Genic insights from integrated human proteomics in GeneCards |
title | Genic insights from integrated human proteomics in GeneCards |
title_full | Genic insights from integrated human proteomics in GeneCards |
title_fullStr | Genic insights from integrated human proteomics in GeneCards |
title_full_unstemmed | Genic insights from integrated human proteomics in GeneCards |
title_short | Genic insights from integrated human proteomics in GeneCards |
title_sort | genic insights from integrated human proteomics in genecards |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4820835/ https://www.ncbi.nlm.nih.gov/pubmed/27048349 http://dx.doi.org/10.1093/database/baw030 |
work_keys_str_mv | AT fishilevichsimon genicinsightsfromintegratedhumanproteomicsingenecards AT zimmermanshahar genicinsightsfromintegratedhumanproteomicsingenecards AT kohnasher genicinsightsfromintegratedhumanproteomicsingenecards AT inysteintsippi genicinsightsfromintegratedhumanproteomicsingenecards AT olendertsviya genicinsightsfromintegratedhumanproteomicsingenecards AT kolkereugene genicinsightsfromintegratedhumanproteomicsingenecards AT safranmarilyn genicinsightsfromintegratedhumanproteomicsingenecards AT lancetdoron genicinsightsfromintegratedhumanproteomicsingenecards |
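
The description field above mentions per-gene protein expression vectors, a pairwise expression-based proximity metric, tissue overexpression calls, and a protein–RNA ratio, but the record does not give their formulas. The following is only a minimal sketch of how such quantities might be computed, under loud assumptions: Pearson correlation stands in for the unspecified pairwise metric, a fold-over-mean cutoff stands in for the unspecified differential expression rule, and every name, threshold, and number here (`pearson`, `overexpressing_tissues`, `protein_rna_ratio`, the four tissues, the toy abundances) is hypothetical and not taken from HIPED or GeneCards.

```python
import math

# Toy data, invented for illustration only (the paper uses 69 normal tissues).
TISSUES = ["liver", "brain", "heart", "kidney"]
GENE_A = [8.0, 1.0, 1.0, 2.0]   # hypothetical normalized protein abundances
GENE_B = [16.0, 2.0, 2.0, 4.0]  # same tissue profile as GENE_A, scaled by 2
GENE_C = [1.0, 8.0, 1.0, 2.0]   # different tissue profile
RNA_A = [4.0, 2.0, 0.0, 2.0]    # hypothetical RNA levels for GENE_A


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumption: used here as a stand-in for the paper's pairwise metric.
    Returns 0.0 when either vector has no variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def overexpressing_tissues(vector, tissues, fold=2.0):
    """Tissues whose abundance is at least `fold` times the vector mean.

    Assumption: a fold-over-mean cutoff is a hypothetical rule, not the
    paper's differential expression procedure.
    """
    mean = sum(vector) / len(vector)
    if mean <= 0:
        return []
    return [t for t, v in zip(tissues, vector) if v >= fold * mean]


def protein_rna_ratio(protein, rna):
    """Per-tissue protein/RNA ratio; None where the RNA level is zero."""
    return [p / r if r > 0 else None for p, r in zip(protein, rna)]


if __name__ == "__main__":
    print("A vs B:", pearson(GENE_A, GENE_B))
    print("A vs C:", pearson(GENE_A, GENE_C))
    print("Overexpressing tissues for A:", overexpressing_tissues(GENE_A, TISSUES))
    print("Protein/RNA ratio for A:", protein_rna_ratio(GENE_A, RNA_A))
```

In this toy setup, genes with proportional tissue profiles score close to 1 regardless of absolute abundance, which is one reason a correlation-style measure is a plausible choice for comparing expression vectors; the actual metric used in GeneCards may differ.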