The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data

EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis a...

Bibliographic Details
Main authors: Milnthorpe, Andrew T., Soloviev, Mikhail
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3297614/
https://www.ncbi.nlm.nih.gov/pubmed/22412959
http://dx.doi.org/10.1371/journal.pone.0032966
_version_ 1782225900692045824
author Milnthorpe, Andrew T.
Soloviev, Mikhail
author_facet Milnthorpe, Andrew T.
Soloviev, Mikhail
author_sort Milnthorpe, Andrew T.
collection PubMed
description EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding “tissue-specific” genes and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues, e.g. brain and peripheral nervous tissue, heart and muscle tissues, and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging.
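The tissue-typing idea in the description (correlating a library's EST counts against a reference expression matrix) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state which correlation measure was used, so Pearson correlation is swapped in, and the cluster IDs, tissue profiles, and function names are hypothetical toy values, not the 244-cluster matrix from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no tissue signal
    return cov / sqrt(var_x * var_y)


def rank_tissues(library_counts, matrix):
    """Rank tissues by correlation between a library and each reference profile.

    library_counts: dict mapping cluster ID -> EST count in the library.
    matrix: dict mapping tissue name -> dict of cluster ID -> reference level.
    Clusters missing from a profile or from the library count as 0.
    """
    clusters = sorted({c for profile in matrix.values() for c in profile})
    library = [library_counts.get(c, 0) for c in clusters]
    scores = {
        tissue: pearson(library, [profile.get(c, 0) for c in clusters])
        for tissue, profile in matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy matrix: four made-up cluster IDs, three tissues.
toy_matrix = {
    "brain": {"Hs.A": 10, "Hs.B": 8, "Hs.C": 0, "Hs.D": 1},
    "heart": {"Hs.A": 0, "Hs.B": 1, "Hs.C": 9, "Hs.D": 7},
    "muscle": {"Hs.A": 0, "Hs.B": 0, "Hs.C": 7, "Hs.D": 9},
}

# Hypothetical small library whose tissue of origin is to be checked.
toy_library = {"Hs.A": 5, "Hs.B": 3}

ranked = rank_tissues(toy_library, toy_matrix)
for tissue, score in ranked:
    print(f"{tissue}: {score:.3f}")
```

In this toy case the library correlates positively with the brain profile and negatively with the heart and muscle profiles. A library annotated as one tissue but scoring highest against another would be a candidate for closer inspection.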
format Online
Article
Text
id pubmed-3297614
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3297614 2012-03-12 The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data Milnthorpe, Andrew T. Soloviev, Mikhail PLoS One Research Article EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding “tissue-specific” genes and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues, e.g. brain and peripheral nervous tissue, heart and muscle tissues, and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging. Public Library of Science 2012-03-08 /pmc/articles/PMC3297614/ /pubmed/22412959 http://dx.doi.org/10.1371/journal.pone.0032966 Text en Milnthorpe, Soloviev. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Milnthorpe, Andrew T.
Soloviev, Mikhail
The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title_full The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title_fullStr The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title_full_unstemmed The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title_short The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
title_sort use of est expression matrixes for the quality control of gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3297614/
https://www.ncbi.nlm.nih.gov/pubmed/22412959
http://dx.doi.org/10.1371/journal.pone.0032966
work_keys_str_mv AT milnthorpeandrewt theuseofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata
AT solovievmikhail theuseofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata
AT milnthorpeandrewt useofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata
AT solovievmikhail useofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata