
Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis

Background: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. This data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns among different genes between many e...


Bibliographic Details
Main Authors: Kinoshita, Kengo, Obayashi, Takeshi
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2759550/
https://www.ncbi.nlm.nih.gov/pubmed/19620096
http://dx.doi.org/10.1093/bioinformatics/btp442
_version_ 1782172683080826880
author Kinoshita, Kengo
Obayashi, Takeshi
author_facet Kinoshita, Kengo
Obayashi, Takeshi
author_sort Kinoshita, Kengo
collection PubMed
description Background: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. This data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns among different genes across many experiments. However, gene expression levels differ significantly across various tissues in higher organisms, as well as in different cellular locations in eukaryotes in different cell states. Thus, the usual correlation measure can only evaluate the difference of tissues or cellular localizations, and cannot adequately elucidate the functional relationship from the coexpression of genes. Method: We propose a new measure of coexpression by expanding the generally used correlation into a multidimensional one. We used principal component analyses to identify the major factors of gene expression correlation, and then recalculated the correlation after subtracting the major components in order to remove biases caused by a few experiments. The repeated subtractions of the major components yielded a set of correlation values for each pair of genes. We observed the correlation changes when the first ten principal components were subtracted step-by-step in large-scale Arabidopsis expression data. Results: We found two extreme patterns of correlation changes, corresponding to stable and fragile coexpression. Our new indexes provided a good means to determine the functional relationships of genes, as shown by examining a few examples, and yielded higher performance in Gene Ontology term prediction using a support vector machine with the multidimensional correlation. Availability: The results are available from the expression detail pages in ATTED-II (http://atted.jp). Contact: kinosita@hgc.jp Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2759550
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2759550 2009-10-15 Bioinformatics Original Papers Oxford University Press 2009-10-15 2009-07-20 /pmc/articles/PMC2759550/ /pubmed/19620096 http://dx.doi.org/10.1093/bioinformatics/btp442 Text en © 2009 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2759550/
https://www.ncbi.nlm.nih.gov/pubmed/19620096
http://dx.doi.org/10.1093/bioinformatics/btp442
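The Method part of the description field outlines a procedure: find the principal components of the expression data, subtract the leading components one at a time, and recompute the correlation of each gene pair after every subtraction. The following is a minimal sketch of one plausible reading of that procedure, not the authors' implementation. It assumes a genes-by-experiments matrix, column centering, PCA via singular value decomposition, and Pearson correlation; the function name `multidim_correlation` and its parameters are hypothetical.

```python
import numpy as np


def multidim_correlation(expr, gene_a, gene_b, n_components=10):
    """Correlation of two genes after removing 0..n_components leading PCs.

    expr is a (genes x experiments) array. All choices below (column
    centering, SVD-based PCA, Pearson correlation) are assumptions made
    for illustration; the record does not specify them.
    """
    x = np.asarray(expr, dtype=float)
    if n_components > min(x.shape):
        raise ValueError("n_components exceeds the number of available components")

    # Center each experiment (column) so that the SVD gives principal components.
    x = x - x.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(x, full_matrices=False)

    residual = x.copy()
    correlations = []
    for k in range(n_components + 1):
        if k > 0:
            # Subtract the k-th principal component (a rank-one term).
            residual = residual - s[k - 1] * np.outer(u[:, k - 1], vt[k - 1])
        # Pearson correlation of the two gene profiles in the residual data.
        r = np.corrcoef(residual[gene_a], residual[gene_b])[0, 1]
        correlations.append(r)
    return np.array(correlations)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(50, 30))  # synthetic data, for illustration only
    print(multidim_correlation(demo, 0, 1, n_components=10))
```

The returned vector has one entry per subtraction step, starting with the correlation before any component is removed. Under this reading, a pair whose values stay high across steps would correspond to the "stable" pattern in the Results, and a pair whose correlation collapses after the first few subtractions to the "fragile" one.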