A multivariate prediction model for microarray cross-hybridization

BACKGROUND: Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is cons...

Bibliographic Details
Main authors: Chen, Yian A, Chou, Cheng-Chung, Lu, Xinghua, Slate, Elizabeth H, Peck, Konan, Xu, Wenying, Voit, Eberhard O, Almeida, Jonas S
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1409802/
https://www.ncbi.nlm.nih.gov/pubmed/16509965
http://dx.doi.org/10.1186/1471-2105-7-101
author Chen, Yian A
Chou, Cheng-Chung
Lu, Xinghua
Slate, Elizabeth H
Peck, Konan
Xu, Wenying
Voit, Eberhard O
Almeida, Jonas S
author_sort Chen, Yian A
collection PubMed
description BACKGROUND: Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is considered particularly problematic for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-) hybridization. RESULTS: We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome p450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used. CONCLUSION: A systematic multivariate approach is provided to assess the importance of multiple sequence features for hybridization and of relationships among these features. This approach can easily be applied to larger datasets. This will allow future developments of generalized hybridization models that will be able to correct for false-positive cross-hybridization signals in expression experiments.
format Text
id pubmed-1409802
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1409802 2006-04-21 A multivariate prediction model for microarray cross-hybridization. BMC Bioinformatics. Research Article. BioMed Central 2006-03-01 /pmc/articles/PMC1409802/ /pubmed/16509965 http://dx.doi.org/10.1186/1471-2105-7-101 Text en Copyright © 2006 Chen et al; licensee BioMed Central Ltd.
title A multivariate prediction model for microarray cross-hybridization
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1409802/
https://www.ncbi.nlm.nih.gov/pubmed/16509965
http://dx.doi.org/10.1186/1471-2105-7-101