Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis
Background: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. This data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns among different genes between many e...
Main authors: | Kinoshita, Kengo; Obayashi, Takeshi |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2009 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2759550/ https://www.ncbi.nlm.nih.gov/pubmed/19620096 http://dx.doi.org/10.1093/bioinformatics/btp442 |
_version_ | 1782172683080826880 |
---|---|
author | Kinoshita, Kengo Obayashi, Takeshi |
author_facet | Kinoshita, Kengo Obayashi, Takeshi |
author_sort | Kinoshita, Kengo |
collection | PubMed |
description | Background: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. These data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns among different genes across many experiments. However, gene expression levels differ significantly across tissues in higher organisms, as well as across cellular locations and cell states in eukaryotes. Thus the usual correlation measure can only evaluate the difference between tissues or cellular localizations, and cannot adequately elucidate functional relationships from the coexpression of genes. Method: We propose a new measure of coexpression by expanding the generally used correlation into a multidimensional one. We used principal component analysis to identify the major factors of gene expression correlation, and then recalculated the correlation after subtracting the major components in order to remove biases caused by a few experiments. The repeated subtraction of the major components yielded a set of correlation values for each pair of genes. We observed the correlation changes when the first ten principal components were subtracted step by step in large-scale Arabidopsis expression data. Results: We found two extreme patterns of correlation changes, corresponding to stable and fragile coexpression. Our new indexes provided a good means of determining the functional relationships of genes, as shown by a few examples, and yielded higher performance in Gene Ontology term prediction using a support vector machine with the multidimensional correlation. Availability: The results are available from the expression detail pages in ATTED-II (http://atted.jp). Contact: kinosita@hgc.jp Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Text |
id | pubmed-2759550 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2759550 2009-10-15 Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis Kinoshita, Kengo Obayashi, Takeshi Bioinformatics Original Papers Oxford University Press 2009-10-15 2009-07-20 /pmc/articles/PMC2759550/ /pubmed/19620096 http://dx.doi.org/10.1093/bioinformatics/btp442 Text en © 2009 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Kinoshita, Kengo Obayashi, Takeshi Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis |
title | Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2759550/ https://www.ncbi.nlm.nih.gov/pubmed/19620096 http://dx.doi.org/10.1093/bioinformatics/btp442 |
work_keys_str_mv | AT kinoshitakengo multidimensionalcorrelationsforgenecoexpressionandapplicationtothelargescaledataofarabidopsis AT obayashitakeshi multidimensionalcorrelationsforgenecoexpressionandapplicationtothelargescaledataofarabidopsis |
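
The Method in the description above (identify major factors by principal component analysis, subtract them one at a time, and recompute the correlation after each subtraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `multidim_correlation`, the parameters `expr` and `n_components`, the use of SVD on a gene-centered matrix to obtain the principal components, and the choice of Pearson correlation are all assumptions made for illustration, since the record does not specify these details.

```python
import numpy as np


def multidim_correlation(expr, n_components=10):
    """Hypothetical sketch: correlations after stepwise removal of principal components.

    expr: 2-D array, rows = genes, columns = experiments (assumed layout).
    Returns an array of shape (n_components + 1, n_genes, n_genes); slice k
    holds the gene-gene Pearson correlations after the first k components
    have been subtracted (slice 0 is the ordinary correlation).
    """
    x = np.asarray(expr, dtype=float)
    # Center each gene's expression profile across experiments.
    x = x - x.mean(axis=1, keepdims=True)
    # SVD of the centered matrix; component i is s[i] * outer(u[:, i], vt[i]).
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    if n_components > len(s):
        raise ValueError("n_components exceeds the number of available components")

    residual = x.copy()
    correlations = [np.corrcoef(residual)]
    for i in range(n_components):
        # Subtract the next major component, then recompute the correlation.
        residual = residual - s[i] * np.outer(u[:, i], vt[i])
        correlations.append(np.corrcoef(residual))
    return np.stack(correlations)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(6, 40))   # 6 synthetic genes, 40 synthetic experiments
    demo[:, :20] += 5.0               # a shared offset standing in for a tissue effect
    profile = multidim_correlation(demo, n_components=3)
    # Correlation of genes 0 and 1 as components are removed step by step.
    print(profile[:, 0, 1])
```

Under these assumptions, each gene pair gets a vector of correlation values rather than a single number. A pair whose correlation stays high as components are removed would correspond to what the abstract calls stable coexpression, while a pair whose correlation collapses after the first subtraction would be fragile.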