Evaluating annotations of an Agilent expression chip suggests that many features cannot be interpreted

BACKGROUND: While attempting to reanalyze published data from Agilent 4 × 44 human expression chips, we found that some of the 60-mer oligonucleotide features could not be interpreted as representing single human genes. For example, some of the oligonucleotides align with the transcripts of more than...

Bibliographic Details
Main authors: Gertz, E Michael; Sengupta, Kundan; Difilippantonio, Michael J; Ried, Thomas; Schäffer, Alejandro A
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2791105/
https://www.ncbi.nlm.nih.gov/pubmed/19948035
http://dx.doi.org/10.1186/1471-2164-10-566
author Gertz, E Michael
Sengupta, Kundan
Difilippantonio, Michael J
Ried, Thomas
Schäffer, Alejandro A
collection PubMed
description BACKGROUND: While attempting to reanalyze published data from Agilent 4 × 44 human expression chips, we found that some of the 60-mer oligonucleotide features could not be interpreted as representing single human genes. For example, some of the oligonucleotides align with the transcripts of more than one gene. We decided to check the annotations for all autosomes and the X chromosome systematically using bioinformatics methods. RESULTS: Out of 42683 reporters, we found that 25505 (60%) passed all our tests and are considered "fully valid". 9964 (23%) reporters did not have a meaningful identifier, mapped to the wrong chromosome, or did not pass basic alignment tests, preventing us from correlating the expression values of these reporters with a unique annotated human gene. The remaining 7214 (17%) reporters could be associated with either a unique gene or a unique intergenic location, but could not be mapped to a transcript in RefSeq. The 7214 reporters are further partitioned into three different levels of validity. CONCLUSION: Expression array studies should evaluate the annotations of reporters and remove those reporters that have suspect annotations. This evaluation can be done systematically and semi-automatically, but one must recognize that data sources are frequently updated, leading to slightly changing validation results over time.
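The reporter counts quoted in the RESULTS portion of the abstract can be checked with a few lines of arithmetic. The following is only a minimal sketch: the counts and the total come from the abstract, while the category labels (in particular "not interpretable") and the helper name `percentages` are hypothetical shorthand introduced here, not terms used by the authors.

```python
# Reporter counts as stated in the abstract; the labels are illustrative
# shorthand (assumptions), not the authors' own category names.
TOTAL_REPORTERS = 42683
counts = {
    "fully valid": 25505,
    "not interpretable": 9964,
    "unique gene or intergenic location, no RefSeq transcript": 7214,
}


def percentages(counts, total):
    """Return each category's share of the total, rounded to a whole percent."""
    return {name: round(100 * n / total) for name, n in counts.items()}


# The three categories should account for every reporter on the chip.
assert sum(counts.values()) == TOTAL_REPORTERS

for name, pct in percentages(counts, TOTAL_REPORTERS).items():
    print(f"{name}: {counts[name]} reporters ({pct}%)")
```

The three counts sum to the stated total of 42683, and the rounded shares agree with the 60%, 23%, and 17% given in the abstract.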
format Text
id pubmed-2791105
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2791105 2009-12-10 Evaluating annotations of an Agilent expression chip suggests that many features cannot be interpreted Gertz, E Michael Sengupta, Kundan Difilippantonio, Michael J Ried, Thomas Schäffer, Alejandro A BMC Genomics Research article BACKGROUND: While attempting to reanalyze published data from Agilent 4 × 44 human expression chips, we found that some of the 60-mer oligonucleotide features could not be interpreted as representing single human genes. For example, some of the oligonucleotides align with the transcripts of more than one gene. We decided to check the annotations for all autosomes and the X chromosome systematically using bioinformatics methods. RESULTS: Out of 42683 reporters, we found that 25505 (60%) passed all our tests and are considered "fully valid". 9964 (23%) reporters did not have a meaningful identifier, mapped to the wrong chromosome, or did not pass basic alignment tests, preventing us from correlating the expression values of these reporters with a unique annotated human gene. The remaining 7214 (17%) reporters could be associated with either a unique gene or a unique intergenic location, but could not be mapped to a transcript in RefSeq. The 7214 reporters are further partitioned into three different levels of validity. CONCLUSION: Expression array studies should evaluate the annotations of reporters and remove those reporters that have suspect annotations. This evaluation can be done systematically and semi-automatically, but one must recognize that data sources are frequently updated, leading to slightly changing validation results over time. BioMed Central 2009-11-30 /pmc/articles/PMC2791105/ /pubmed/19948035 http://dx.doi.org/10.1186/1471-2164-10-566 Text en Copyright ©2009 Gertz et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluating annotations of an Agilent expression chip suggests that many features cannot be interpreted
topic Research article