Joint learning improves protein abundance prediction in cancers

Bibliographic Details
Main Authors: Li, Hongyang; Siddiqui, Omer; Zhang, Hongjiu; Guan, Yuanfang
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929375/
https://www.ncbi.nlm.nih.gov/pubmed/31870366
http://dx.doi.org/10.1186/s12915-019-0730-9
_version_ 1783482689694728192
author Li, Hongyang
Siddiqui, Omer
Zhang, Hongjiu
Guan, Yuanfang
author_facet Li, Hongyang
Siddiqui, Omer
Zhang, Hongjiu
Guan, Yuanfang
author_sort Li, Hongyang
collection PubMed
description BACKGROUND: The classic central dogma in biology is the information flow from DNA to mRNA to protein, yet complicated regulatory mechanisms underlying protein translation often lead to weak correlations between mRNA and protein abundances. This is particularly the case in cancer samples and when evaluating the same gene across multiple samples. RESULTS: Here, we report a method for predicting proteome from transcriptome, using a training dataset provided by NCI-CPTAC and TCGA, consisting of transcriptome and proteome data from 77 breast and 105 ovarian cancer samples. First, we establish a generic model capturing the correlation between mRNA and protein abundance of a single gene. Second, we build a gene-specific model capturing the interdependencies among multiple genes in a regulatory network. Third, we create a cross-tissue model by joint learning the information of shared regulatory networks and pathways across cancer tissues. Our method ranked first in the NCI-CPTAC DREAM Proteogenomics Challenge, and the predictive performance is close to the accuracy of experimental replicates. Key functional pathways and network modules controlling the proteomic abundance in cancers were revealed, in particular metabolism-related genes. CONCLUSIONS: We present a method to predict proteome from transcriptome, leveraging data from different cancer tissues to build a trans-tissue model, and suggest how to integrate information from multiple cancers to provide a foundation for further research.
format Online
Article
Text
id pubmed-6929375
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-69293752019-12-30 Joint learning improves protein abundance prediction in cancers Li, Hongyang Siddiqui, Omer Zhang, Hongjiu Guan, Yuanfang BMC Biol Methodology Article BACKGROUND: The classic central dogma in biology is the information flow from DNA to mRNA to protein, yet complicated regulatory mechanisms underlying protein translation often lead to weak correlations between mRNA and protein abundances. This is particularly the case in cancer samples and when evaluating the same gene across multiple samples. RESULTS: Here, we report a method for predicting proteome from transcriptome, using a training dataset provided by NCI-CPTAC and TCGA, consisting of transcriptome and proteome data from 77 breast and 105 ovarian cancer samples. First, we establish a generic model capturing the correlation between mRNA and protein abundance of a single gene. Second, we build a gene-specific model capturing the interdependencies among multiple genes in a regulatory network. Third, we create a cross-tissue model by joint learning the information of shared regulatory networks and pathways across cancer tissues. Our method ranked first in the NCI-CPTAC DREAM Proteogenomics Challenge, and the predictive performance is close to the accuracy of experimental replicates. Key functional pathways and network modules controlling the proteomic abundance in cancers were revealed, in particular metabolism-related genes. CONCLUSIONS: We present a method to predict proteome from transcriptome, leveraging data from different cancer tissues to build a trans-tissue model, and suggest how to integrate information from multiple cancers to provide a foundation for further research. BioMed Central 2019-12-23 /pmc/articles/PMC6929375/ /pubmed/31870366 http://dx.doi.org/10.1186/s12915-019-0730-9 Text en © The Author(s). 
2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Li, Hongyang
Siddiqui, Omer
Zhang, Hongjiu
Guan, Yuanfang
Joint learning improves protein abundance prediction in cancers
title Joint learning improves protein abundance prediction in cancers
title_full Joint learning improves protein abundance prediction in cancers
title_fullStr Joint learning improves protein abundance prediction in cancers
title_full_unstemmed Joint learning improves protein abundance prediction in cancers
title_short Joint learning improves protein abundance prediction in cancers
title_sort joint learning improves protein abundance prediction in cancers
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929375/
https://www.ncbi.nlm.nih.gov/pubmed/31870366
http://dx.doi.org/10.1186/s12915-019-0730-9
work_keys_str_mv AT lihongyang jointlearningimprovesproteinabundancepredictionincancers
AT siddiquiomer jointlearningimprovesproteinabundancepredictionincancers
AT zhanghongjiu jointlearningimprovesproteinabundancepredictionincancers
AT guanyuanfang jointlearningimprovesproteinabundancepredictionincancers
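
Illustrative note on the abstract: the description field mentions a per-gene mRNA-protein correlation and a cross-tissue model learned jointly from breast and ovarian samples. The record does not say which learning algorithm the authors used, so the following is only a minimal sketch under stated assumptions: ridge regression stands in for the unspecified model, "joint learning" is approximated by simply pooling samples from both tissues, and all data are synthetic. The function names (per_gene_pearson, ridge_fit, ridge_predict, simulate) are hypothetical and not taken from the article; only the sample counts (77 and 105) come from the abstract.

```python
import numpy as np


def per_gene_pearson(mrna, protein):
    """Pearson correlation per gene; rows are genes, columns are samples."""
    m = mrna - mrna.mean(axis=1, keepdims=True)
    p = protein - protein.mean(axis=1, keepdims=True)
    denom = np.sqrt((m ** 2).sum(axis=1) * (p ** 2).sum(axis=1))
    return (m * p).sum(axis=1) / denom


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept.

    X has shape (samples, features); y has shape (samples,).
    Ridge is an assumption here, not the authors' stated method.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def ridge_predict(X, w, b):
    """Predict protein abundance from mRNA features."""
    return X @ w + b


# Synthetic stand-in data: one target protein predicted from 20 mRNA features.
rng = np.random.default_rng(0)
n_genes = 20
true_w = rng.normal(size=n_genes)


def simulate(n_samples):
    """Draw synthetic mRNA features and a noisy linear protein response."""
    X = rng.normal(size=(n_samples, n_genes))
    y = X @ true_w + 0.5 * rng.normal(size=n_samples)
    return X, y


X_breast, y_breast = simulate(77)     # sample count from the abstract
X_ovarian, y_ovarian = simulate(105)  # sample count from the abstract

# Pooling both tissues is the simplest possible form of joint training.
X_joint = np.vstack([X_breast, X_ovarian])
y_joint = np.concatenate([y_breast, y_ovarian])
w_joint, b_joint = ridge_fit(X_joint, y_joint, alpha=1.0)
pred_breast = ridge_predict(X_breast, w_joint, b_joint)
```

Pooling assumes the mRNA-protein relationship is shared across tissues, which is the premise the abstract attributes to shared regulatory networks and pathways; real data would need held-out evaluation per tissue to check whether pooling helps.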