Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation

BACKGROUND: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate thes...

Bibliographic Details
Main authors: Liu, Donglin, Graber, Joel H
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431573/
https://www.ncbi.nlm.nih.gov/pubmed/16503995
http://dx.doi.org/10.1186/1471-2105-7-77
_version_ 1782127208625602560
author Liu, Donglin
Graber, Joel H
author_facet Liu, Donglin
Graber, Joel H
author_sort Liu, Donglin
collection PubMed
description BACKGROUND: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate these libraries. RESULTS: We used alignment of ESTs against a reference set of transcripts to estimate the size distributions of the cDNA inserts and sampled mRNA transcripts in individual EST libraries and show how these measurements can be used to inform quantitative comparisons of libraries. While significant attention has been paid to the effects of normalization and subtraction, we also find significant biases in transcript sampling introduced by the combined procedures of reverse transcription and selection of cDNA clones for sequencing. Using examples drawn from studies of mRNA 3'-processing (cleavage and polyadenylation), we demonstrate effects of the transcript sampling bias, and provide a method for identifying libraries that can be safely compared without bias. All data sets, supplemental data, and software are available at our supplemental web site [1]. CONCLUSION: The biases we characterize in the transcript sampling of EST libraries represent a significant and heretofore under-appreciated source of false positive candidates for tissue-, cell type-, or developmental stage-specific activity or processing of genes. Uncorrected, quantitative comparison of dissimilar EST libraries will likely result in the identification of statistically significant, but biologically meaningless changes.
format Text
id pubmed-1431573
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-14315732006-04-21 Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation Liu, Donglin Graber, Joel H BMC Bioinformatics Research Article BACKGROUND: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate these libraries. RESULTS: We used alignment of ESTs against a reference set of transcripts to estimate the size distributions of the cDNA inserts and sampled mRNA transcripts in individual EST libraries and show how these measurements can be used to inform quantitative comparisons of libraries. While significant attention has been paid to the effects of normalization and subtraction, we also find significant biases in transcript sampling introduced by the combined procedures of reverse transcription and selection of cDNA clones for sequencing. Using examples drawn from studies of mRNA 3'-processing (cleavage and polyadenylation), we demonstrate effects of the transcript sampling bias, and provide a method for identifying libraries that can be safely compared without bias. All data sets, supplemental data, and software are available at our supplemental web site [1]. CONCLUSION: The biases we characterize in the transcript sampling of EST libraries represent a significant and heretofore under-appreciated source of false positive candidates for tissue-, cell type-, or developmental stage-specific activity or processing of genes. Uncorrected, quantitative comparison of dissimilar EST libraries will likely result in the identification of statistically significant, but biologically meaningless changes. BioMed Central 2006-02-17 /pmc/articles/PMC1431573/ /pubmed/16503995 http://dx.doi.org/10.1186/1471-2105-7-77 Text en Copyright © 2006 Liu and Graber; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Liu, Donglin
Graber, Joel H
Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title_full Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title_fullStr Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title_full_unstemmed Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title_short Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
title_sort quantitative comparison of est libraries requires compensation for systematic biases in cdna generation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431573/
https://www.ncbi.nlm.nih.gov/pubmed/16503995
http://dx.doi.org/10.1186/1471-2105-7-77
work_keys_str_mv AT liudonglin quantitativecomparisonofestlibrariesrequirescompensationforsystematicbiasesincdnageneration
AT graberjoelh quantitativecomparisonofestlibrariesrequirescompensationforsystematicbiasesincdnageneration