Locational distribution of gene functional classes in Arabidopsis thaliana

BACKGROUND: We are interested in understanding the locational distribution of genes and their functions in genomes, as this distribution has both functional and evolutionary significance. Gene locational distribution is known to be affected by various evolutionary processes, with tandem duplication...

Bibliographic Details
Main Authors: Riley, Michael C; Clare, Amanda; King, Ross D
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1855069/
https://www.ncbi.nlm.nih.gov/pubmed/17397552
http://dx.doi.org/10.1186/1471-2105-8-112
_version_ 1782133135460270080
author Riley, Michael C
Clare, Amanda
King, Ross D
collection PubMed
description BACKGROUND: We are interested in understanding the locational distribution of genes and their functions in genomes, as this distribution has both functional and evolutionary significance. Gene locational distribution is known to be affected by various evolutionary processes, with tandem duplication thought to be the main process producing clustering of homologous sequences. Recent research has found clustering of protein structural families in the human genome, even when genes identified as tandem duplicates have been removed from the data. However, this previous research was hindered by an inability to analyse small sample sizes. This is a challenge for bioinformatics, as more specific functional classes have fewer examples and conventional statistical analyses of these small data sets often produce unsatisfactory results. RESULTS: We have developed a novel bioinformatics method based on Monte Carlo methods and Greenwood's spacing statistic for the computational analysis of the distribution of individual functional classes of genes (from GO). We used this to make the first comprehensive statistical analysis of the relationship between gene functional class and location on a genome. Analysis of the distribution of all genes except tandem duplicates on the five chromosomes of A. thaliana reveals that the distribution on chromosomes I, II, IV and V is clustered at P = 0.001. Many functional classes are clustered, with the degree of clustering within an individual class generally consistent across all five chromosomes. A novel and surprising result was that the locational distribution of some functional classes was significantly more evenly spaced than would be expected by chance. CONCLUSION: Analysis of the A. thaliana genome reveals evidence of unexplained order in the locational distribution of genes. The same general analysis method can be applied to any genome, and indeed any sequential data involving classes.
format Text
id pubmed-1855069
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1855069 2007-04-30 Locational distribution of gene functional classes in Arabidopsis thaliana Riley, Michael C Clare, Amanda King, Ross D BMC Bioinformatics Research Article BioMed Central 2007-03-30 /pmc/articles/PMC1855069/ /pubmed/17397552 http://dx.doi.org/10.1186/1471-2105-8-112 Text en Copyright © 2007 Riley et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Locational distribution of gene functional classes in Arabidopsis thaliana
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1855069/
https://www.ncbi.nlm.nih.gov/pubmed/17397552
http://dx.doi.org/10.1186/1471-2105-8-112
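
The description above names Monte Carlo methods and Greenwood's spacing statistic but gives no computational detail. The following is a minimal Python sketch of how such a test could look, not the authors' implementation. It assumes a null model of uniformly random positions along a chromosome of known length; the record does not state which null model the paper used, and a null drawn from the positions of all genes on the chromosome would be an equally plausible choice. The function names greenwood and monte_carlo_p and the parameters trials and seed are hypothetical.

```python
import random


def greenwood(positions, length):
    """Greenwood's statistic: the sum of squared spacings between
    consecutive points (chromosome ends included), with each spacing
    expressed as a fraction of the total length. Values near the
    minimum 1/(n+1) indicate even spacing; values near 1 indicate
    tight clustering."""
    edges = [0.0] + sorted(positions) + [float(length)]
    return sum(((b - a) / length) ** 2 for a, b in zip(edges, edges[1:]))


def monte_carlo_p(positions, length, trials=2000, seed=0):
    """Empirical p-values for clustering and for even spacing.

    Assumption: the null model places the same number of points
    uniformly at random on [0, length]."""
    rng = random.Random(seed)
    n = len(positions)
    observed = greenwood(positions, length)
    sims = [
        greenwood([rng.uniform(0, length) for _ in range(n)], length)
        for _ in range(trials)
    ]
    # Add-one correction keeps the estimated p-values above zero.
    p_clustered = (1 + sum(s >= observed for s in sims)) / (trials + 1)
    p_even = (1 + sum(s <= observed for s in sims)) / (trials + 1)
    return observed, p_clustered, p_even
```

Reporting both tails mirrors the two outcomes the description mentions: a class can be significantly clustered (large statistic) or significantly more evenly spaced than chance (small statistic). Because the null distribution is simulated rather than taken from an asymptotic formula, the same procedure applies to functional classes with only a handful of genes.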