
GeneSurrounder: network-based identification of disease genes in expression data

BACKGROUND: A key challenge of identifying disease-associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis te...


Bibliographic Details
Main Authors: Shah, Sahil D., Braun, Rosemary
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6503437/
https://www.ncbi.nlm.nih.gov/pubmed/31060502
http://dx.doi.org/10.1186/s12859-019-2829-y
_version_ 1783416412003368960
author Shah, Sahil D.
Braun, Rosemary
author_facet Shah, Sahil D.
Braun, Rosemary
author_sort Shah, Sahil D.
collection PubMed
description BACKGROUND: A key challenge of identifying disease-associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis techniques identifies groups of related genes using interaction networks, but these gene sets often comprise tens or hundreds of genes, making experimental follow-up challenging. A more recent category of methods identifies precise gene targets while incorporating systems-level information, but these techniques do not determine whether a gene is a driving source of changes in its network, an important characteristic when looking for potential drug targets. RESULTS: We introduce GeneSurrounder, an analysis method that integrates expression data and network information in a novel procedure to detect genes that are sources of dysregulation on the network. The key idea of our method is to score genes based on the evidence that they influence the dysregulation of their neighbors on the network in a manner that impacts cell function. Applying GeneSurrounder to real expression data, we show that our method is able to identify biologically relevant genes, integrate pathway and expression data, and yield more reproducible results across multiple studies of the same phenotype than competing methods. CONCLUSIONS: Together these findings suggest that GeneSurrounder provides a new avenue for identifying individual genes that can be targeted therapeutically. The key innovation of GeneSurrounder is the combination of pathway network information with gene expression data to determine the degree to which a gene is a source of dysregulation on the network. By prioritizing genes in this way, our method provides insights into disease mechanisms and suggests diagnostic and therapeutic targets. Our method can be used to help biologists select among tens or hundreds of genes for further validation. The implementation in R is available at github.com/sahildshah1/gene-surrounder. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2829-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6503437
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6503437 2019-05-10 GeneSurrounder: network-based identification of disease genes in expression data Shah, Sahil D. Braun, Rosemary BMC Bioinformatics Methodology Article BioMed Central 2019-05-06 /pmc/articles/PMC6503437/ /pubmed/31060502 http://dx.doi.org/10.1186/s12859-019-2829-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Shah, Sahil D.
Braun, Rosemary
GeneSurrounder: network-based identification of disease genes in expression data
title GeneSurrounder: network-based identification of disease genes in expression data
title_full GeneSurrounder: network-based identification of disease genes in expression data
title_fullStr GeneSurrounder: network-based identification of disease genes in expression data
title_full_unstemmed GeneSurrounder: network-based identification of disease genes in expression data
title_short GeneSurrounder: network-based identification of disease genes in expression data
title_sort genesurrounder: network-based identification of disease genes in expression data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6503437/
https://www.ncbi.nlm.nih.gov/pubmed/31060502
http://dx.doi.org/10.1186/s12859-019-2829-y
work_keys_str_mv AT shahsahild genesurroundernetworkbasedidentificationofdiseasegenesinexpressiondata
AT braunrosemary genesurroundernetworkbasedidentificationofdiseasegenesinexpressiondata