
Optimal cDNA microarray design using expressed sequence tags for organisms with limited genomic information

BACKGROUND: Expression microarrays are increasingly used to characterize environmental responses and host-parasite interactions for many different organisms. Probe selection for cDNA microarrays using expressed sequence tags (ESTs) is challenging due to high sequence redundancy and potential cross-h...


Bibliographic Details
Main authors: Chen, Yian A, Mckillen, David J, Wu, Shuyuan, Jenny, Matthew J, Chapman, Robert, Gross, Paul S, Warr, Gregory W, Almeida, Jonas S
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC539232/
https://www.ncbi.nlm.nih.gov/pubmed/15585062
http://dx.doi.org/10.1186/1471-2105-5-191
author Chen, Yian A
Mckillen, David J
Wu, Shuyuan
Jenny, Matthew J
Chapman, Robert
Gross, Paul S
Warr, Gregory W
Almeida, Jonas S
collection PubMed
description BACKGROUND: Expression microarrays are increasingly used to characterize environmental responses and host-parasite interactions for many different organisms. Probe selection for cDNA microarrays using expressed sequence tags (ESTs) is challenging due to high sequence redundancy and potential cross-hybridization between paralogous genes. In organisms with limited genomic information, such as marine organisms, this challenge is even greater due to annotation uncertainty. No general tool is available for cDNA microarray probe selection for these organisms. Therefore, the goal of the design procedure described here is to select a subset of ESTs that will minimize sequence redundancy and characterize potential cross-hybridization while providing functionally representative probes.
RESULTS: Sequence similarity between ESTs, quantified by the E-value of pair-wise alignment, was used as a surrogate for expected hybridization between corresponding sequences. Using this value as a measure of dissimilarity, sequence redundancy reduction was performed by hierarchical cluster analyses. The choice of how many microarray probes to retain was made based on an index developed for this research: a sequence diversity index (SDI) within a sequence diversity plot (SDP). This index tracked the decreasing within-cluster sequence diversity as the number of clusters increased. For a given stage in the agglomeration procedure, the EST having the highest similarity to all the other sequences within each cluster, the centroid EST, was selected as a microarray probe. A small dataset of ESTs from Atlantic white shrimp (Litopenaeus setiferus) was used to test this algorithm so that the detailed results could be examined. The extent to which the selected probes were functionally representative was quantified using Gene Ontology (GO) annotations.
CONCLUSIONS: For organisms with limited genomic information, combining hierarchical clustering methods to analyze ESTs can yield an optimal cDNA microarray design. If biomarker discovery is the goal of the microarray experiments, the average linkage method is more effective, while single linkage is more suitable if identifying physiological mechanisms is of greater interest. This general design procedure is not limited to designing single-species cDNA microarrays for marine organisms; it can equally be applied to multiple-species microarrays for any organisms with limited genomic information.
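The clustering and probe-selection steps summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dissimilarity matrix holds invented toy numbers standing in for E-value-derived dissimilarities, the function names `agglomerate` and `centroid` are hypothetical, and the SDI is not reproduced because its formula is not given in this record. The sketch only shows agglomerative clustering under single or average linkage, followed by picking, in each cluster, the member with the lowest total dissimilarity to the others.

```python
from itertools import combinations


def agglomerate(dist, n_clusters, linkage="average"):
    """Merge clusters bottom-up until n_clusters remain.

    dist: symmetric matrix (list of lists) of pairwise dissimilarities.
    linkage: "single" (minimum pairwise value) or "average" (mean value).
    """
    clusters = [[i] for i in range(len(dist))]

    def between(a, b):
        # Dissimilarity between two clusters under the chosen linkage.
        pairs = [dist[i][j] for i in a for j in b]
        return min(pairs) if linkage == "single" else sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: between(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return clusters


def centroid(cluster, dist):
    """Return the member with the smallest summed dissimilarity to its cluster."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Toy dissimilarities for five hypothetical ESTs (values are assumptions).
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]

clusters = agglomerate(dist, n_clusters=2, linkage="average")
probes = [centroid(c, dist) for c in clusters]
print(clusters, probes)
```

Switching `linkage` to `"single"` changes only how the distance between two clusters is computed, which is the design choice the conclusions contrast for biomarker discovery versus identifying physiological mechanisms.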
format Text
id pubmed-539232
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-539232 2004-12-24 BMC Bioinformatics Research Article BioMed Central 2004-12-07 /pmc/articles/PMC539232/ /pubmed/15585062 http://dx.doi.org/10.1186/1471-2105-5-191 Text en Copyright © 2004 Chen et al; licensee BioMed Central Ltd.
title Optimal cDNA microarray design using expressed sequence tags for organisms with limited genomic information
topic Research Article