Statistical identification of gene association by CID in application of constructing ER regulatory network

BACKGROUND: A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association,...

Bibliographic Details
Main authors: Liu, Li-Yu D, Chen, Chien-Yu, Chen, Mei-Ju M, Tsai, Ming-Shian, Lee, Cho-Han S, Phang, Tzu L, Chang, Li-Yun, Kuo, Wen-Hung, Hwa, Hsiao-Lin, Lien, Huang-Chun, Jung, Shih-Ming, Lin, Yi-Shing, Chang, King-Jen, Hsieh, Fon-Jou
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679734/
https://www.ncbi.nlm.nih.gov/pubmed/19292896
http://dx.doi.org/10.1186/1471-2105-10-85
author Liu, Li-Yu D
Chen, Chien-Yu
Chen, Mei-Ju M
Tsai, Ming-Shian
Lee, Cho-Han S
Phang, Tzu L
Chang, Li-Yun
Kuo, Wen-Hung
Hwa, Hsiao-Lin
Lien, Huang-Chun
Jung, Shih-Ming
Lin, Yi-Shing
Chang, King-Jen
Hsieh, Fon-Jou
author_sort Liu, Li-Yu D
collection PubMed
description BACKGROUND: A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, the coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. Given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). RESULTS: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). Compared with GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit a linear model. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. CONCLUSION: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. AVAILABILITY: The implementation of CID in R code can be freely downloaded from .
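The abstract's central idea, judging dependence of Y on X from the conditional distribution of Y given X rather than from a linear fit, can be illustrated with a small sketch. The code below is not the authors' CID estimator or their R implementation. The function `conditional_dependence_score`, its quantile binning of X, and the synthetic data are all assumptions made for illustration. The score compares the conditional distribution function of Y within each X bin against the overall distribution function of Y, and normalizes the result so that it lies between 0 and 1.

```python
import math
import random


def pearson(x, y):
    """Galton-Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def conditional_dependence_score(x, y, n_bins=4):
    """Hypothetical CID-style score (an assumption, not the published estimator).

    X is split into n_bins groups of roughly equal size. For every observed
    threshold t, the between-bin variance of the conditional CDF P(Y <= t | bin)
    is accumulated and divided by the accumulated total variance F(t)(1 - F(t)).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    bins = [order[k * n // n_bins:(k + 1) * n // n_bins] for k in range(n_bins)]
    bins = [b for b in bins if b]  # drop empty bins when n < n_bins
    between, total = 0.0, 0.0
    for t in y:  # thresholds taken at the observed Y values
        f_all = sum(1 for v in y if v <= t) / n
        total += f_all * (1.0 - f_all)
        for b in bins:
            f_bin = sum(1 for i in b if y[i] <= t) / len(b)
            between += (len(b) / n) * (f_bin - f_all) ** 2
    return between / total if total > 0 else 0.0


# Synthetic data (illustrative only): Y depends on X through a parabola.
random.seed(0)
n_samples = 200
xs = [-1.0 + 2.0 * i / (n_samples - 1) for i in range(n_samples)]
ys_nonlinear = [v * v + random.gauss(0.0, 0.05) for v in xs]
ys_independent = [random.gauss(0.0, 1.0) for _ in xs]

r_nonlinear = pearson(xs, ys_nonlinear)
s_nonlinear = conditional_dependence_score(xs, ys_nonlinear)
s_independent = conditional_dependence_score(xs, ys_independent)

print("Pearson r, parabolic Y:        ", round(r_nonlinear, 3))
print("Conditional score, parabolic Y:", round(s_nonlinear, 3))
print("Conditional score, independent:", round(s_independent, 3))
```

On the parabolic data the linear correlation is close to zero while the conditional-distribution score is clearly larger than for the independent data. This is the kind of non-linear association the abstract says CID is designed to detect. The number of bins is a tuning choice in this sketch and does not come from the paper.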
format Text
id pubmed-2679734
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2679734 2009-05-11 BMC Bioinformatics Research Article BioMed Central 2009-03-17 /pmc/articles/PMC2679734/ /pubmed/19292896 http://dx.doi.org/10.1186/1471-2105-10-85 Text en Copyright © 2009 Liu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Statistical identification of gene association by CID in application of constructing ER regulatory network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679734/
https://www.ncbi.nlm.nih.gov/pubmed/19292896
http://dx.doi.org/10.1186/1471-2105-10-85