
mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networks

BACKGROUND: Identification of cooperative gene regulatory networks is an important topic in biological research, especially in cancer research. Traditional approaches suffer from substantial noise in gene expression data and false positive connections in motif binding data; they also fail to identify the modu...


Bibliographic Details
Main authors: Shi, Xu, Gu, Jinghua, Chen, Xi, Shajahan, Ayesha, Hilakivi-Clarke, Leena, Clarke, Robert, Xuan, Jianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028818/
https://www.ncbi.nlm.nih.gov/pubmed/24564939
http://dx.doi.org/10.1186/1752-0509-7-S5-S4
_version_ 1782317111969841152
author Shi, Xu
Gu, Jinghua
Chen, Xi
Shajahan, Ayesha
Hilakivi-Clarke, Leena
Clarke, Robert
Xuan, Jianhua
author_sort Shi, Xu
collection PubMed
description BACKGROUND: Identification of cooperative gene regulatory networks is an important topic in biological research, especially in cancer research. Traditional approaches suffer from substantial noise in gene expression data and false positive connections in motif binding data; they also fail to identify the modularized structure of gene regulatory networks. Methods are needed that can reveal the underlying modularized structure and that are robust to noise and false positives. RESULTS: We proposed and developed an integrated approach to identify gene regulatory networks, which consists of a novel clustering method, motif-guided affinity propagation clustering (mAPC), and a sampling-based method, a Gibbs sampler based on the outlier sum statistic (GibbsOS). mAPC is used in the first step to obtain co-regulated gene modules by clustering genes with a similarity measure that takes into account both gene expression data and binding motif information. This clustering method reduces the effect of noise in microarray data to obtain modularized gene clusters. However, because motif binding data contain many false positives, some genes that are not regulated by a given transcription factor (TF) will be falsely clustered with true target genes. To overcome this problem, GibbsOS is applied in the second step to refine each cluster and identify the true target genes. To evaluate the performance of the proposed method, we generated simulation data under different signal-to-noise ratios and false positive ratios. The experimental results show improved accuracy in clustering and transcription factor identification. Moreover, improved performance is demonstrated in target gene identification compared with GibbsOS.
Finally, we applied the proposed method to two breast cancer patient datasets to identify cooperative transcriptional regulatory networks associated with recurrence of breast cancer, as supported by their functional annotations. CONCLUSIONS: We have developed a two-step approach for gene regulatory network identification, featuring an integrated method that identifies modularized regulatory structures and subsequently refines their target genes. Simulation studies have shown the robustness of the method against noise in gene expression data and false positives in motif binding data. The proposed method has been applied to two breast cancer gene expression datasets to infer the hidden regulatory mechanisms. The experimental results demonstrate the efficacy of the method in identifying key regulatory networks related to the progression and recurrence of breast cancer.
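The abstract says that mAPC clusters genes with a similarity measure combining expression data and binding motif information, but it does not give the formula. The sketch below is only an illustration of that idea under stated assumptions: the weighted blend of Pearson correlation and Jaccard motif overlap, the `weight` parameter, the function names, and the toy data are all hypothetical and are not taken from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / sqrt(vx * vy)


def jaccard(a, b):
    """Overlap between the sets of TF motifs predicted for two genes."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(expr_i, expr_j, motifs_i, motifs_j, weight=0.5):
    """Hypothetical blend (an assumption, not the published measure):
    weight * expression correlation + (1 - weight) * motif overlap."""
    return (weight * pearson(expr_i, expr_j)
            + (1 - weight) * jaccard(motifs_i, motifs_j))


def similarity_matrix(expression, motifs, weight=0.5):
    """Pairwise similarities for all distinct gene pairs, keyed by (gene, gene)."""
    genes = sorted(expression)
    return {
        (g, h): combined_similarity(expression[g], expression[h],
                                    motifs[g], motifs[h], weight)
        for g in genes for h in genes if g != h
    }


# Toy data, invented for illustration only.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
motifs = {
    "geneA": {"TF1", "TF2"},
    "geneB": {"TF1", "TF2"},
    "geneC": {"TF3"},
}

sim = similarity_matrix(expression, motifs)
for pair in [("geneA", "geneB"), ("geneA", "geneC")]:
    print(pair, round(sim[pair], 3))
```

A matrix like `sim` could then be passed to any affinity propagation implementation as the input similarities. Genes that are both co-expressed and share predicted motifs score highest, which is the intuition behind a motif-guided clustering step.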
format Online
Article
Text
id pubmed-4028818
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40288182014-06-19 mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networks Shi, Xu Gu, Jinghua Chen, Xi Shajahan, Ayesha Hilakivi-Clarke, Leena Clarke, Robert Xuan, Jianhua BMC Syst Biol Research BioMed Central 2013-12-09 /pmc/articles/PMC4028818/ /pubmed/24564939 http://dx.doi.org/10.1186/1752-0509-7-S5-S4 Text en Copyright © 2013 Shi et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Shi, Xu
Gu, Jinghua
Chen, Xi
Shajahan, Ayesha
Hilakivi-Clarke, Leena
Clarke, Robert
Xuan, Jianhua
mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networks
title mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networks
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028818/
https://www.ncbi.nlm.nih.gov/pubmed/24564939
http://dx.doi.org/10.1186/1752-0509-7-S5-S4