Revealing common disease mechanisms shared by tumors of different tissues of origin through semantic representation of genomic alterations and topic modeling

BACKGROUND: Cancer is a complex disease driven by somatic genomic alterations (SGAs) that perturb signaling pathways and consequently cellular function. Identifying patterns of pathway perturbations would provide insights into common disease mechanisms shared among tumors, which is important for gui...

Bibliographic Details
Main authors: Chen, Vicky; Paisley, John; Lu, Xinghua
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374647/
https://www.ncbi.nlm.nih.gov/pubmed/28361690
http://dx.doi.org/10.1186/s12864-017-3494-z
author Chen, Vicky
Paisley, John
Lu, Xinghua
collection PubMed
description BACKGROUND: Cancer is a complex disease driven by somatic genomic alterations (SGAs) that perturb signaling pathways and consequently cellular function. Identifying patterns of pathway perturbations would provide insights into common disease mechanisms shared among tumors, which is important for guiding treatment and predicting outcome. However, identifying perturbed pathways is challenging, because different tumors can have the same perturbed pathways that are perturbed by different SGAs. Here, we designed novel semantic representations that capture the functional similarity of distinct SGAs perturbing a common pathway in different tumors. Combining this representation with topic modeling would allow us to identify patterns in altered signaling pathways. RESULTS: We represented each gene with a vector of words describing its function, and we represented the SGAs of a tumor as a text document by pooling the words representing individual SGAs. We applied the nested hierarchical Dirichlet process (nHDP) model to a collection of tumors of 5 cancer types from TCGA. We identified topics (consisting of co-occurring words) representing the common functional themes of different SGAs. Tumors were clustered based on their topic associations, such that each cluster consists of tumors sharing common functional themes. The resulting clusters contained mixtures of cancer types, which indicates that different cancer types can share disease mechanisms. Survival analysis based on the clusters revealed significant differences in survival among the tumors of the same cancer type that were assigned to different clusters. CONCLUSIONS: The results indicate that applying topic modeling to semantic representations of tumors identifies patterns in the combinations of altered functional pathways in cancer.
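The representation step in the abstract above (each gene as a set of function words, each tumor as the pooled words of its altered genes) can be illustrated with a minimal sketch. The gene names, word lists, and helper functions below are hypothetical placeholders, not taken from the article, and the sketch stops before any topic model such as nHDP is applied.

```python
from collections import Counter

# Hypothetical toy annotations: gene -> words describing its function.
# These names and word lists are invented for illustration only.
GENE_WORDS = {
    "GENE_A": ["kinase", "signaling", "growth"],
    "GENE_B": ["phosphatase", "signaling", "growth"],
    "GENE_C": ["dna", "repair", "checkpoint"],
}


def tumor_document(altered_genes, gene_words=GENE_WORDS):
    """Pool the words of every altered gene into one bag-of-words document."""
    doc = Counter()
    for gene in altered_genes:
        # Genes without annotations contribute no words.
        doc.update(gene_words.get(gene, []))
    return doc


def shared_words(doc_a, doc_b):
    """Return the words two tumor documents have in common (multiset intersection)."""
    return doc_a & doc_b


# Two tumors altered in different genes can still share functional words.
tumor_1 = tumor_document(["GENE_A", "GENE_C"])
tumor_2 = tumor_document(["GENE_B"])
overlap = shared_words(tumor_1, tumor_2)
print(sorted(overlap))
```

In this toy case the two tumors have no altered gene in common, yet their documents overlap on the words of a shared function, which is the kind of similarity a topic model could then pick up.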
format Online
Article
Text
id pubmed-5374647
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5374647 2017-04-03 BMC Genomics Research BioMed Central 2017-03-14 /pmc/articles/PMC5374647/ /pubmed/28361690 http://dx.doi.org/10.1186/s12864-017-3494-z Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Revealing common disease mechanisms shared by tumors of different tissues of origin through semantic representation of genomic alterations and topic modeling
topic Research