The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis a...
Main authors: | Milnthorpe, Andrew T., Soloviev, Mikhail |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3297614/ https://www.ncbi.nlm.nih.gov/pubmed/22412959 http://dx.doi.org/10.1371/journal.pone.0032966 |
_version_ | 1782225900692045824 |
---|---|
author | Milnthorpe, Andrew T. Soloviev, Mikhail |
author_facet | Milnthorpe, Andrew T. Soloviev, Mikhail |
author_sort | Milnthorpe, Andrew T. |
collection | PubMed |
description | EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding “tissue-specific” genes and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues e.g. brain and peripheral nervous tissue, heart and muscle tissues and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging. |
format | Online Article Text |
id | pubmed-3297614 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3297614 2012-03-12 The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data Milnthorpe, Andrew T. Soloviev, Mikhail PLoS One Research Article EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding “tissue-specific” genes and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues e.g. brain and peripheral nervous tissue, heart and muscle tissues and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging. Public Library of Science 2012-03-08 /pmc/articles/PMC3297614/ /pubmed/22412959 http://dx.doi.org/10.1371/journal.pone.0032966 Text en Milnthorpe, Soloviev. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Milnthorpe, Andrew T. Soloviev, Mikhail The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title | The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title_full | The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title_fullStr | The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title_full_unstemmed | The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title_short | The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data |
title_sort | use of est expression matrixes for the quality control of gene expression data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3297614/ https://www.ncbi.nlm.nih.gov/pubmed/22412959 http://dx.doi.org/10.1371/journal.pone.0032966 |
work_keys_str_mv | AT milnthorpeandrewt theuseofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata AT solovievmikhail theuseofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata AT milnthorpeandrewt useofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata AT solovievmikhail useofestexpressionmatrixesforthequalitycontrolofgeneexpressiondata |