Double feature selection and cluster analyses in mining of microarray data from cotton

BACKGROUND: Cotton fiber is a single-celled seed trichome of major biological and economic importance. In recent years, genomic approaches such as microarray-based expression profiling were used to study fiber growth and development to understand the developmental mechanisms of fiber at the molecula...

Bibliographic Details
Main Authors:	Alabady, Magdy S; Youn, Eunseog; Wilkins, Thea A
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2441630/ https://www.ncbi.nlm.nih.gov/pubmed/18570655 http://dx.doi.org/10.1186/1471-2164-9-295

author	Alabady, Magdy S; Youn, Eunseog; Wilkins, Thea A
author_facet	Alabady, Magdy S; Youn, Eunseog; Wilkins, Thea A
author_sort	Alabady, Magdy S
collection	PubMed
description	BACKGROUND: Cotton fiber is a single-celled seed trichome of major biological and economic importance. In recent years, genomic approaches such as microarray-based expression profiling have been used to study fiber growth and development in order to understand the developmental mechanisms of fiber at the molecular level. The vast volume of microarray expression data generated requires sophisticated data mining to extract novel information that addresses fundamental questions of biological interest. One way to approach microarray data mining is to add dimensions/levels to the analysis, such as comparing independent studies from different genotypes. However, adding dimensions also creates the challenge of finding novel ways to analyze multi-dimensional microarray data. RESULTS: Mining of independent microarray studies from Pima and Upland (TM1) cotton using double feature selection and cluster analyses identified species-specific and stage-specific gene transcripts that argue in favor of discrete genetic mechanisms that govern developmental programming of cotton fiber morphogenesis in these two cultivated species. Double feature selection analysis identified the highest number of differentially expressed genes that distinguish the fiber transcriptomes of developing Pima and TM1 fibers. These results were based on the finding that differences in fibers harvested between 17 and 24 days post-anthesis (dpa) represent the greatest expressional distance between the two species. This powerful selection method identified a subset of genes expressed during primary (PCW) and secondary (SCW) cell wall biogenesis in Pima fibers that exhibits an expression pattern that is generally reversed in TM1 at the same developmental stage. Cluster and functional analyses revealed that this subset of genes is primarily regulated during the transition stage that overlaps the termination of PCW and the onset of SCW biogenesis, suggesting that these particular genes play a major role in the genetic mechanism that underlies the phenotypic differences in fiber traits between Pima and TM1. CONCLUSION: The novel application of double feature selection analysis led to the discovery of species- and stage-specific genetic expression patterns, which are biologically relevant to the genetic programs that underlie the differences in the fiber phenotypes in Pima and TM1. These results promise to have profound impacts on the ongoing efforts to improve cotton fiber traits.
format	Text
id	pubmed-2441630
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2441630 2008-07-01 Double feature selection and cluster analyses in mining of microarray data from cotton. Alabady, Magdy S; Youn, Eunseog; Wilkins, Thea A. BMC Genomics. Research Article. BioMed Central 2008-06-20 /pmc/articles/PMC2441630/ /pubmed/18570655 http://dx.doi.org/10.1186/1471-2164-9-295 Text en Copyright © 2008 Alabady et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Double feature selection and cluster analyses in mining of microarray data from cotton
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2441630/ https://www.ncbi.nlm.nih.gov/pubmed/18570655 http://dx.doi.org/10.1186/1471-2164-9-295
