Identifying pathogenic processes by integrating microarray data with prior knowledge
BACKGROUND: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although there has been an extensive use of various high-throughput methods for this task, pathogenic pathways are still not completely understood. Often the set of genes or prot...
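Main Authors: | Nygård, Ståle; Reitan, Trond; Clancy, Trevor; Nygaard, Vegard; Bjørnstad, Johannes; Skrbic, Biljana; Tønnessen, Theis; Christensen, Geir; Hovig, Eivind |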
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4006456/ https://www.ncbi.nlm.nih.gov/pubmed/24758699 http://dx.doi.org/10.1186/1471-2105-15-115 |
_version_ | 1782314218833313792 |
---|---|
author | Nygård, Ståle Reitan, Trond Clancy, Trevor Nygaard, Vegard Bjørnstad, Johannes Skrbic, Biljana Tønnessen, Theis Christensen, Geir Hovig, Eivind |
author_facet | Nygård, Ståle Reitan, Trond Clancy, Trevor Nygaard, Vegard Bjørnstad, Johannes Skrbic, Biljana Tønnessen, Theis Christensen, Geir Hovig, Eivind |
author_sort | Nygård, Ståle |
collection | PubMed |
description | BACKGROUND: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although there has been an extensive use of various high-throughput methods for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens show a poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial in order to improve the understanding of the molecular processes underlying the disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections are used as priors in the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. RESULTS: Simulation results showed that the method improved the ability of identifying correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. CONCLUSION: Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways. |
format | Online Article Text |
id | pubmed-4006456 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-40064562014-05-19 Identifying pathogenic processes by integrating microarray data with prior knowledge Nygård, Ståle Reitan, Trond Clancy, Trevor Nygaard, Vegard Bjørnstad, Johannes Skrbic, Biljana Tønnessen, Theis Christensen, Geir Hovig, Eivind BMC Bioinformatics Methodology Article BACKGROUND: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although there has been an extensive use of various high-throughput methods for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens show a poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial in order to improve the understanding of the molecular processes underlying the disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections are used as priors in the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. RESULTS: Simulation results showed that the method improved the ability of identifying correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. CONCLUSION: Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways. BioMed Central 2014-04-24 /pmc/articles/PMC4006456/ /pubmed/24758699 http://dx.doi.org/10.1186/1471-2105-15-115 Text en Copyright © 2014 Nygård et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Nygård, Ståle Reitan, Trond Clancy, Trevor Nygaard, Vegard Bjørnstad, Johannes Skrbic, Biljana Tønnessen, Theis Christensen, Geir Hovig, Eivind Identifying pathogenic processes by integrating microarray data with prior knowledge |
title | Identifying pathogenic processes by integrating microarray data with prior knowledge |
title_full | Identifying pathogenic processes by integrating microarray data with prior knowledge |
title_fullStr | Identifying pathogenic processes by integrating microarray data with prior knowledge |
title_full_unstemmed | Identifying pathogenic processes by integrating microarray data with prior knowledge |
title_short | Identifying pathogenic processes by integrating microarray data with prior knowledge |
title_sort | identifying pathogenic processes by integrating microarray data with prior knowledge |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4006456/ https://www.ncbi.nlm.nih.gov/pubmed/24758699 http://dx.doi.org/10.1186/1471-2105-15-115 |
work_keys_str_mv | AT nygardstale identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT reitantrond identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT clancytrevor identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT nygaardvegard identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT bjørnstadjohannes identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT skrbicbiljana identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT tønnessentheis identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT christensengeir identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge AT hovigeivind identifyingpathogenicprocessesbyintegratingmicroarraydatawithpriorknowledge |