Identifying pathogenic processes by integrating microarray data with prior knowledge

Bibliographic Details
Main Authors: Nygård, Ståle, Reitan, Trond, Clancy, Trevor, Nygaard, Vegard, Bjørnstad, Johannes, Skrbic, Biljana, Tønnessen, Theis, Christensen, Geir, Hovig, Eivind
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4006456/
https://www.ncbi.nlm.nih.gov/pubmed/24758699
http://dx.doi.org/10.1186/1471-2105-15-115
_version_ 1782314218833313792
author Nygård, Ståle
Reitan, Trond
Clancy, Trevor
Nygaard, Vegard
Bjørnstad, Johannes
Skrbic, Biljana
Tønnessen, Theis
Christensen, Geir
Hovig, Eivind
author_facet Nygård, Ståle
Reitan, Trond
Clancy, Trevor
Nygaard, Vegard
Bjørnstad, Johannes
Skrbic, Biljana
Tønnessen, Theis
Christensen, Geir
Hovig, Eivind
author_sort Nygård, Ståle
collection PubMed
description BACKGROUND: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although various high-throughput methods have been used extensively for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens shows poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial for improving the understanding of the molecular processes underlying disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections is used as a prior for the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. RESULTS: Simulation results showed that the method improved the ability to identify correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset, the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. CONCLUSION: Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways.
format Online
Article
Text
id pubmed-4006456
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4006456 2014-05-19 Identifying pathogenic processes by integrating microarray data with prior knowledge Nygård, Ståle Reitan, Trond Clancy, Trevor Nygaard, Vegard Bjørnstad, Johannes Skrbic, Biljana Tønnessen, Theis Christensen, Geir Hovig, Eivind BMC Bioinformatics Methodology Article BACKGROUND: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although various high-throughput methods have been used extensively for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens shows poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial for improving the understanding of the molecular processes underlying disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections is used as a prior for the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. RESULTS: Simulation results showed that the method improved the ability to identify correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset, the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. CONCLUSION: Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways. BioMed Central 2014-04-24 /pmc/articles/PMC4006456/ /pubmed/24758699 http://dx.doi.org/10.1186/1471-2105-15-115 Text en Copyright © 2014 Nygård et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Nygård, Ståle
Reitan, Trond
Clancy, Trevor
Nygaard, Vegard
Bjørnstad, Johannes
Skrbic, Biljana
Tønnessen, Theis
Christensen, Geir
Hovig, Eivind
Identifying pathogenic processes by integrating microarray data with prior knowledge
title Identifying pathogenic processes by integrating microarray data with prior knowledge
title_full Identifying pathogenic processes by integrating microarray data with prior knowledge
title_fullStr Identifying pathogenic processes by integrating microarray data with prior knowledge
title_full_unstemmed Identifying pathogenic processes by integrating microarray data with prior knowledge
title_short Identifying pathogenic processes by integrating microarray data with prior knowledge
title_sort identifying pathogenic processes by integrating microarray data with prior knowledge
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4006456/
https://www.ncbi.nlm.nih.gov/pubmed/24758699
http://dx.doi.org/10.1186/1471-2105-15-115
work_keys_str_mv AT nygardstale identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT reitantrond identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT clancytrevor identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT nygaardvegard identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT bjørnstadjohannes identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT skrbicbiljana identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT tønnessentheis identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT christensengeir identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge
AT hovigeivind identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge