Identification of functionally related genes using data mining and data integration: a breast cancer case study

BACKGROUND: The identification of the organisation and dynamics of molecular pathways is crucial for the understanding of cell function. In order to reconstruct the molecular pathways in which a gene of interest is involved in regulating a cell, it is important to identify the set of genes to which...

Bibliographic Details
Main Authors: Mosca, Ettore; Bertoli, Gloria; Piscitelli, Eleonora; Vilardo, Laura; Reinbold, Rolland A; Zucchi, Ileana; Milanesi, Luciano
Format: Text
Language: English
Published: BioMed Central 2009
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762073/
https://www.ncbi.nlm.nih.gov/pubmed/19828084
http://dx.doi.org/10.1186/1471-2105-10-S12-S8
author Mosca, Ettore
Bertoli, Gloria
Piscitelli, Eleonora
Vilardo, Laura
Reinbold, Rolland A
Zucchi, Ileana
Milanesi, Luciano
collection PubMed
description BACKGROUND: The identification of the organisation and dynamics of molecular pathways is crucial for the understanding of cell function. To reconstruct the molecular pathways in which a gene of interest is involved, it is important to identify the set of genes with which it interacts to determine cell function. In this context, mining and integrating the large amount of publicly available data on the transcriptome and proteome states of a cell is a useful complement to biological research. RESULTS: We describe an approach for the identification of genes that interact with each other to regulate cell function. The strategy relies on the analysis of gene expression profile similarity across large expression datasets. During the similarity evaluation, the methodology determines the most significant subset of samples in which the evaluated genes are highly correlated. Hence, the strategy enables the exclusion of samples that are not relevant for each gene pair analysed. This feature is important when considering a large set of samples characterised by heterogeneous experimental conditions, where different pools of biological processes can be active across the samples. The putative partners of the studied gene are then further characterised by analysing the distribution of Gene Ontology terms and integrating protein-protein interaction (PPI) data. The strategy was applied to the analysis of the functional relationships of a gene of known function, Pyruvate Kinase, and to the prediction of functional partners of the human transcription factor TBX3. In both cases the analysis was done on a dataset composed of breast primary tumour expression data derived from the literature. Integration and analysis of PPI data confirmed the predictions of the methodology, since the genes identified as functionally related were associated with proteins close in the PPI network. 
Two genes among the predicted putative partners of TBX3 (GLI3 and GATA3) were confirmed by in vivo binding assays (crosslinking chromatin immunoprecipitation, X-ChIP), in which the putative DNA enhancer sequence sites of GATA3 and GLI3 were found to be bound by the Tbx3 protein. CONCLUSION: The presented strategy is demonstrated to be an effective approach to identify genes that establish functional relationships. The methodology identifies and characterises genes with a similar expression profile, through data mining and the integration of data from publicly available resources, to contribute to a better understanding of gene regulation and cell function. The prediction of the TBX3 target genes GLI3 and GATA3 was experimentally confirmed.
format Text
id pubmed-2762073
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2762073 2009-10-15 Identification of functionally related genes using data mining and data integration: a breast cancer case study Mosca, Ettore Bertoli, Gloria Piscitelli, Eleonora Vilardo, Laura Reinbold, Rolland A Zucchi, Ileana Milanesi, Luciano BMC Bioinformatics Research BioMed Central 2009-10-15 /pmc/articles/PMC2762073/ /pubmed/19828084 http://dx.doi.org/10.1186/1471-2105-10-S12-S8 Text en Copyright © 2009 Mosca et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identification of functionally related genes using data mining and data integration: a breast cancer case study
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762073/
https://www.ncbi.nlm.nih.gov/pubmed/19828084
http://dx.doi.org/10.1186/1471-2105-10-S12-S8