Quality determination and the repair of poor quality spots in array experiments

BACKGROUND: A common feature of microarray experiments is the occurrence of missing gene expression data. These missing values occur for a variety of reasons, in particular because of the filtering of poor quality spots and the removal of undefined values when a logarithmic transformation is applied...

Bibliographic Details
Main authors: Tom, Brian DM, Gilks, Walter R, Brooke-Powell, Elizabeth T, Ajioka, James W
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1262693/
https://www.ncbi.nlm.nih.gov/pubmed/16185360
http://dx.doi.org/10.1186/1471-2105-6-234
_version_ 1782125883941715968
author Tom, Brian DM
Gilks, Walter R
Brooke-Powell, Elizabeth T
Ajioka, James W
collection PubMed
description BACKGROUND: A common feature of microarray experiments is the occurrence of missing gene expression data. These missing values occur for a variety of reasons, in particular because of the filtering of poor quality spots and the removal of undefined values when a logarithmic transformation is applied to negative background-corrected intensities. An incomplete matrix of gene intensities can substantially reduce the efficiency and power of an analysis. Additionally, most statistical methods require a complete intensity matrix. Furthermore, biases may be introduced into analyses through missing information on some genes. Thus methods for appropriately replacing (imputing) missing data and/or weighting poor quality spots are required. RESULTS: We present a likelihood-based method for imputing missing data or weighting poor quality spots that requires a number of biological or technical replicates. This likelihood-based approach assumes that the data for a given spot arising from each channel of a two-dye (two-channel) cDNA microarray comparison experiment independently come from a three-component mixture distribution, the parameters of which are estimated using a constrained expectation-maximization (E-M) algorithm. Posterior probabilities of belonging to each component of the mixture distributions are calculated and used to decide whether imputation is required. These posterior probabilities may also be used to construct quality weights that can down-weight poor quality spots in any analysis performed afterwards. The approach is illustrated using data obtained from an experiment to observe gene expression changes with 24 hr paclitaxel (Taxol®) treatment on a human cervical cancer derived cell line (HeLa). CONCLUSION: As the quality of microarray experiments affects downstream processes, it is important to have a reliable and automatic method of identifying poor quality spots and arrays. We propose a method of identifying poor quality spots, and suggest a method of repairing the arrays by either imputation or assigning quality weights to the spots. This repaired data set would be less biased and can be analysed using any of the appropriate statistical methods found in the microarray literature.
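The mixture-model idea in the description above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the component distributions or the constraints used, so this sketch assumes one-dimensional Gaussian components, uses a simple ordering constraint on the component means as a stand-in, and treats the middle component as the "good quality" one. All function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_probabilities(values, weights, means, sds):
    """Posterior probability of each of the three components for every value."""
    out = []
    for x in values:
        dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
        total = sum(dens)
        if total == 0.0:
            # Numerical underflow: fall back to an uninformative posterior.
            out.append([1.0 / 3.0] * 3)
        else:
            out.append([d / total for d in dens])
    return out


def fit_three_component_mixture(values, n_iter=200, min_sd=1e-3):
    """EM for a 3-component Gaussian mixture (assumed form, for illustration)."""
    xs = sorted(values)
    n = len(xs)
    # Start the means at the 10th, 50th and 90th percentiles of the data.
    means = [xs[int(q * (n - 1))] for q in (0.1, 0.5, 0.9)]
    sds = [((xs[-1] - xs[0]) / 6.0) or 1.0] * 3
    weights = [1.0 / 3.0] * 3
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = posterior_probabilities(values, weights, means, sds)
        # M-step: responsibility-weighted updates of each component.
        for k in range(3):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            if nk > 1e-12:
                means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
                var = sum(r[k] * (x - means[k]) ** 2
                          for r, x in zip(resp, values)) / nk
                sds[k] = max(math.sqrt(var), min_sd)
        # Assumed constraint: keep the components ordered by their means.
        order = sorted(range(3), key=lambda k: means[k])
        weights = [weights[k] for k in order]
        means = [means[k] for k in order]
        sds = [sds[k] for k in order]
    return weights, means, sds


def quality_weights(values, weights, means, sds, good_component=1):
    """Quality weight = posterior probability of the assumed 'good' component."""
    post = posterior_probabilities(values, weights, means, sds)
    return [p[good_component] for p in post]


# Usage on simulated values (not data from the paper).
random.seed(0)
data = ([random.gauss(-5.0, 0.5) for _ in range(60)]
        + [random.gauss(0.0, 0.5) for _ in range(200)]
        + [random.gauss(5.0, 0.5) for _ in range(60)])
w, m, s = fit_three_component_mixture(data)
spot_weights = quality_weights(data, w, m, s)
```

In a sketch like this, spots whose weight falls below a chosen threshold could be flagged for imputation, while the weights themselves could be passed to a weighted downstream analysis; the threshold and the choice of "good" component are assumptions that would need to follow the published method.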
format Text
id pubmed-1262693
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1262693 2005-10-22 Quality determination and the repair of poor quality spots in array experiments Tom, Brian DM Gilks, Walter R Brooke-Powell, Elizabeth T Ajioka, James W BMC Bioinformatics Methodology Article BioMed Central 2005-09-26 /pmc/articles/PMC1262693/ /pubmed/16185360 http://dx.doi.org/10.1186/1471-2105-6-234 Text en Copyright © 2005 Tom et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Quality determination and the repair of poor quality spots in array experiments
topic Methodology Article