GeneSurrounder: network-based identification of disease genes in expression data
BACKGROUND: A key challenge of identifying disease–associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis te...
Main authors: | Shah, Sahil D., Braun, Rosemary |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6503437/ https://www.ncbi.nlm.nih.gov/pubmed/31060502 http://dx.doi.org/10.1186/s12859-019-2829-y |
_version_ | 1783416412003368960 |
---|---|
author | Shah, Sahil D. Braun, Rosemary |
author_facet | Shah, Sahil D. Braun, Rosemary |
author_sort | Shah, Sahil D. |
collection | PubMed |
description | BACKGROUND: A key challenge of identifying disease–associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis techniques identifies groups of related genes using interaction networks, but these gene sets often comprise tens or hundreds of genes, making experimental follow-up challenging. A more recent category of methods identifies precise gene targets while incorporating systems-level information, but these techniques do not determine whether a gene is a driving source of changes in its network, an important characteristic when looking for potential drug targets. RESULTS: We introduce GeneSurrounder, an analysis method that integrates expression data and network information in a novel procedure to detect genes that are sources of dysregulation on the network. The key idea of our method is to score genes based on the evidence that they influence the dysregulation of their neighbors on the network in a manner that impacts cell function. Applying GeneSurrounder to real expression data, we show that our method is able to identify biologically relevant genes, integrate pathway and expression data, and yield more reproducible results across multiple studies of the same phenotype than competing methods. CONCLUSIONS: Together these findings suggest that GeneSurrounder provides a new avenue for identifying individual genes that can be targeted therapeutically. The key innovation of GeneSurrounder is the combination of pathway network information with gene expression data to determine the degree to which a gene is a source of dysregulation on the network. By prioritizing genes in this way, our method provides insights into disease mechanisms and suggests diagnostic and therapeutic targets. Our method can be used to help biologists select among tens or hundreds of genes for further validation. The implementation in R is available at github.com/sahildshah1/gene-surrounder. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2829-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6503437 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6503437 2019-05-10 GeneSurrounder: network-based identification of disease genes in expression data Shah, Sahil D. Braun, Rosemary BMC Bioinformatics Methodology Article BACKGROUND: A key challenge of identifying disease–associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis techniques identifies groups of related genes using interaction networks, but these gene sets often comprise tens or hundreds of genes, making experimental follow-up challenging. A more recent category of methods identifies precise gene targets while incorporating systems-level information, but these techniques do not determine whether a gene is a driving source of changes in its network, an important characteristic when looking for potential drug targets. RESULTS: We introduce GeneSurrounder, an analysis method that integrates expression data and network information in a novel procedure to detect genes that are sources of dysregulation on the network. The key idea of our method is to score genes based on the evidence that they influence the dysregulation of their neighbors on the network in a manner that impacts cell function. Applying GeneSurrounder to real expression data, we show that our method is able to identify biologically relevant genes, integrate pathway and expression data, and yield more reproducible results across multiple studies of the same phenotype than competing methods. CONCLUSIONS: Together these findings suggest that GeneSurrounder provides a new avenue for identifying individual genes that can be targeted therapeutically. The key innovation of GeneSurrounder is the combination of pathway network information with gene expression data to determine the degree to which a gene is a source of dysregulation on the network. By prioritizing genes in this way, our method provides insights into disease mechanisms and suggests diagnostic and therapeutic targets. Our method can be used to help biologists select among tens or hundreds of genes for further validation. The implementation in R is available at github.com/sahildshah1/gene-surrounder. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2829-y) contains supplementary material, which is available to authorized users. BioMed Central 2019-05-06 /pmc/articles/PMC6503437/ /pubmed/31060502 http://dx.doi.org/10.1186/s12859-019-2829-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Shah, Sahil D. Braun, Rosemary GeneSurrounder: network-based identification of disease genes in expression data |
title | GeneSurrounder: network-based identification of disease genes in expression data |
title_full | GeneSurrounder: network-based identification of disease genes in expression data |
title_fullStr | GeneSurrounder: network-based identification of disease genes in expression data |
title_full_unstemmed | GeneSurrounder: network-based identification of disease genes in expression data |
title_short | GeneSurrounder: network-based identification of disease genes in expression data |
title_sort | genesurrounder: network-based identification of disease genes in expression data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6503437/ https://www.ncbi.nlm.nih.gov/pubmed/31060502 http://dx.doi.org/10.1186/s12859-019-2829-y |
work_keys_str_mv | AT shahsahild genesurroundernetworkbasedidentificationofdiseasegenesinexpressiondata AT braunrosemary genesurroundernetworkbasedidentificationofdiseasegenesinexpressiondata |
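
Illustrative note: the description in this record says the method scores genes by the evidence that they influence the dysregulation of their neighbors on the network. The Python sketch below is only a toy illustration of that general idea and is not the GeneSurrounder procedure, whose R implementation is at the repository named in the description. The network, the differential-expression values, the `radius` parameter, and the function names are all hypothetical, and the score used here (mean absolute differential expression of genes within a few hops) is a simplification chosen for illustration.

```python
from collections import deque


def distances_from(network, source):
    """Breadth-first search: hop distance from source to each reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        for neighbor in network[gene]:
            if neighbor not in dist:
                dist[neighbor] = dist[gene] + 1
                queue.append(neighbor)
    return dist


def neighborhood_score(network, diff_expr, gene, radius):
    """Mean absolute differential expression of genes within `radius` hops.

    The gene itself is excluded, so the score reflects only its surroundings.
    This is a hypothetical stand-in score, not the published statistic.
    """
    dist = distances_from(network, gene)
    nearby = [g for g, d in dist.items() if 0 < d <= radius]
    if not nearby:
        return 0.0
    return sum(abs(diff_expr[g]) for g in nearby) / len(nearby)


# Hypothetical undirected toy network, stored as an adjacency list.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}

# Hypothetical per-gene differential-expression statistics (case vs. control).
diff_expr = {"A": 0.5, "B": 3.0, "C": 2.0, "D": 0.0, "E": 1.0}

# Score every gene by its immediate neighborhood and rank from high to low.
scores = {g: neighborhood_score(network, diff_expr, g, radius=1) for g in network}
ranked = sorted(scores, key=scores.get, reverse=True)

for gene in ranked:
    print(gene, scores[gene])
```

In this toy setup a gene with a small change of its own can still rank first when its neighbors are strongly dysregulated, which is the intuition behind prioritizing a single gene rather than a whole gene set.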