Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans

BACKGROUND: Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases in the estimation of gene expression rates from microar...

Bibliographic Details
Main authors: Munoz, Enrique T, Bogarad, Leonard D, Deem, Michael W
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC434498/
https://www.ncbi.nlm.nih.gov/pubmed/15134588
http://dx.doi.org/10.1186/1471-2164-5-30
_version_ 1782121517922910208
author Munoz, Enrique T
Bogarad, Leonard D
Deem, Michael W
author_facet Munoz, Enrique T
Bogarad, Leonard D
Deem, Michael W
author_sort Munoz, Enrique T
collection PubMed
description BACKGROUND: Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases in the estimation of gene expression rates from microarray data and from abundance within the Expressed Sequence Tag (EST) database. We suggest that length is a significant factor in biases to measured gene expression rates. As a specific example of the importance of the bias of expression rate with length, we address the following evolutionary question: Does the average C. elegans protein length increase or decrease with expression level? Two different answers to this question have been reported in the literature, one method using expression levels estimated by abundance within the EST database and another using microarrays. We have investigated this issue by constructing the full protein length versus expression curve for C. elegans, using both methods for estimating expression levels. RESULTS: The microarray data show a monotonic decrease of length with expression level, whereas the abundance within the EST database data show a non-monotonic behavior. Furthermore, the ratio of the expression level estimated by the EST database to that measured by microarrays is not constant, but rather systematically biased with gene length. CONCLUSIONS: It is suggested that the length bias may lie primarily in the abundance within the EST database method, being not ameliorated by internal standards as it is in the microarray data, and that this bias should be removed before data interpretation. When this is done, both the microarray and the abundance within the EST database give a monotonic decrease of spliced length with expression level, and the correlation between the EST and microarray data becomes larger. We suggest that standard RNA controls be used to normalize for length bias in any method that measures expression.
format Text
id pubmed-434498
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4344982004-06-25 Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans Munoz, Enrique T Bogarad, Leonard D Deem, Michael W BMC Genomics Research Article BACKGROUND: Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases in the estimation of gene expression rates from microarray data and from abundance within the Expressed Sequence Tag (EST) database. We suggest that length is a significant factor in biases to measured gene expression rates. As a specific example of the importance of the bias of expression rate with length, we address the following evolutionary question: Does the average C. elegans protein length increase or decrease with expression level? Two different answers to this question have been reported in the literature, one method using expression levels estimated by abundance within the EST database and another using microarrays. We have investigated this issue by constructing the full protein length versus expression curve for C. elegans, using both methods for estimating expression levels. RESULTS: The microarray data show a monotonic decrease of length with expression level, whereas the abundance within the EST database data show a non-monotonic behavior. Furthermore, the ratio of the expression level estimated by the EST database to that measured by microarrays is not constant, but rather systematically biased with gene length. CONCLUSIONS: It is suggested that the length bias may lie primarily in the abundance within the EST database method, being not ameliorated by internal standards as it is in the microarray data, and that this bias should be removed before data interpretation. 
When this is done, both the microarray and the abundance within the EST database give a monotonic decrease of spliced length with expression level, and the correlation between the EST and microarray data becomes larger. We suggest that standard RNA controls be used to normalize for length bias in any method that measures expression. BioMed Central 2004-05-10 /pmc/articles/PMC434498/ /pubmed/15134588 http://dx.doi.org/10.1186/1471-2164-5-30 Text en Copyright © 2004 Munoz et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Research Article
Munoz, Enrique T
Bogarad, Leonard D
Deem, Michael W
Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans
title Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC434498/
https://www.ncbi.nlm.nih.gov/pubmed/15134588
http://dx.doi.org/10.1186/1471-2164-5-30