Identification of functionally related genes using data mining and data integration: a breast cancer case study
BACKGROUND: The identification of the organisation and dynamics of molecular pathways is crucial for the understanding of cell function. In order to reconstruct the molecular pathways in which a gene of interest is involved in regulating a cell, it is important to identify the set of genes to which...
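Main authors: | Mosca, Ettore; Bertoli, Gloria; Piscitelli, Eleonora; Vilardo, Laura; Reinbold, Rolland A; Zucchi, Ileana; Milanesi, Luciano |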
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762073/ https://www.ncbi.nlm.nih.gov/pubmed/19828084 http://dx.doi.org/10.1186/1471-2105-10-S12-S8 |
_version_ | 1782172891584921600 |
---|---|
author | Mosca, Ettore; Bertoli, Gloria; Piscitelli, Eleonora; Vilardo, Laura; Reinbold, Rolland A; Zucchi, Ileana; Milanesi, Luciano |
author_sort | Mosca, Ettore |
collection | PubMed |
description | BACKGROUND: Identifying the organisation and dynamics of molecular pathways is crucial for understanding cell function. To reconstruct the molecular pathways through which a gene of interest regulates a cell, it is important to identify the set of genes with which it interacts to determine cell function. In this context, mining and integrating the large amount of publicly available data on the transcriptome and proteome states of a cell is a useful complement to biological research. RESULTS: We describe an approach for identifying genes that interact with each other to regulate cell function. The strategy relies on the analysis of gene expression profile similarity across large expression datasets. During the similarity evaluation, the methodology determines the most significant subset of samples in which the evaluated genes are highly correlated, which allows samples that are not relevant to a given gene pair to be excluded. This feature is important when considering a large set of samples characterised by heterogeneous experimental conditions, where different pools of biological processes can be active across the samples. The putative partners of the studied gene are then further characterised by analysing the distribution of Gene Ontology terms and integrating protein-protein interaction (PPI) data. The strategy was applied to analyse the functional relationships of a gene of known function, Pyruvate Kinase, and to predict functional partners of the human transcription factor TBX3. In both cases the analysis was performed on a dataset composed of breast primary tumour expression data derived from the literature. Integration and analysis of PPI data confirmed the predictions of the methodology, since the genes identified as functionally related were associated with proteins that are close in the PPI network. Two of the predicted putative partners of TBX3 (GLI3 and GATA3) were confirmed by in vivo binding assays (crosslinking immunoprecipitation, X-ChIP) in which the putative DNA enhancer sequence sites of GATA3 and GLI3 were found to be bound by the Tbx3 protein. CONCLUSION: The presented strategy is shown to be an effective approach for identifying genes that establish functional relationships. The methodology identifies and characterises genes with similar expression profiles by mining and integrating data from publicly available resources, contributing to a better understanding of gene regulation and cell function. The prediction of the TBX3 target genes GLI3 and GATA3 was experimentally confirmed. |
format | Text |
id | pubmed-2762073 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2762073 2009-10-15 BMC Bioinformatics Research BioMed Central /pmc/articles/PMC2762073/ /pubmed/19828084 http://dx.doi.org/10.1186/1471-2105-10-S12-S8 Text en Copyright © 2009 Mosca et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Identification of functionally related genes using data mining and data integration: a breast cancer case study |
topic | Research |
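The description field above says the method looks for the subset of samples in which two genes are highly correlated, but the record does not give the algorithm. The following is only a minimal sketch of that general idea, not the authors' procedure: it assumes Pearson correlation and a greedy trimming rule, and the names `pearson`, `trim_to_correlated_subset`, `min_samples`, and `target_r` are hypothetical. The expression values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def trim_to_correlated_subset(x, y, min_samples=4, target_r=0.9):
    """Greedy illustration (an assumption, not the published method).

    Repeatedly drop the single sample whose removal raises the correlation
    the most, until the correlation reaches target_r, no removal helps,
    or only min_samples samples remain.
    Returns the kept sample indices and the correlation over them.
    """
    keep = list(range(len(x)))
    r = pearson([x[i] for i in keep], [y[i] for i in keep])
    while r < target_r and len(keep) > min_samples:
        best_r, best_drop = r, None
        for i in keep:
            rest = [j for j in keep if j != i]
            candidate = pearson([x[j] for j in rest], [y[j] for j in rest])
            if candidate > best_r:
                best_r, best_drop = candidate, i
        if best_drop is None:  # no single removal improves the correlation
            break
        keep.remove(best_drop)
        r = best_r
    return keep, r


# Made-up expression values for two genes across six samples;
# the fifth sample (index 4) breaks an otherwise linear trend.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.1, 2.0, 2.9, 4.2, 0.5, 6.1]

kept, r = trim_to_correlated_subset(gene_a, gene_b)
print(kept, round(r, 3))
```

Removing samples always risks inflating a correlation, which is presumably why the description speaks of the most significant subset. The `min_samples` floor here is only a crude guard and does not replace a proper significance test.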