
Framework for reanalysis of publicly available Affymetrix® GeneChip® data sets based on functional regions of interest

BACKGROUND: Since the introduction of microarrays in 1995, researchers world-wide have used both commercial and custom-designed microarrays for understanding differential expression of transcribed genes. Public databases such as ArrayExpress and the Gene Expression Omnibus (GEO) have made millions o...


Bibliographic Details
Main authors: Saka, Ernur; Harrison, Benjamin J.; West, Kirk; Petruska, Jeffrey C.; Rouchka, Eric C.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5731501/
https://www.ncbi.nlm.nih.gov/pubmed/29244006
http://dx.doi.org/10.1186/s12864-017-4266-5
_version_ 1783286522527612928
author Saka, Ernur
Harrison, Benjamin J.
West, Kirk
Petruska, Jeffrey C.
Rouchka, Eric C.
author_sort Saka, Ernur
collection PubMed
description BACKGROUND: Since the introduction of microarrays in 1995, researchers world-wide have used both commercial and custom-designed microarrays for understanding differential expression of transcribed genes. Public databases such as ArrayExpress and the Gene Expression Omnibus (GEO) have made millions of samples readily available. One main drawback to microarray data analysis involves the selection of probes to represent a specific transcript of interest, particularly in light of the fact that transcript-specific knowledge (notably alternative splicing) is dynamic in nature. RESULTS: We therefore developed a framework for reannotating and reassigning probe groups for Affymetrix® GeneChip® technology based on functional regions of interest. This framework addresses three issues of Affymetrix® GeneChip® data analyses: removing nonspecific probes, updating probe target mapping based on the latest genome knowledge and grouping probes into gene, transcript and region-based (UTR, individual exon, CDS) probe sets. Updated gene and transcript probe sets provide more specific analysis results based on current genomic and transcriptomic knowledge. The framework selects unique probes, aligns them to gene annotations and generates a custom Chip Description File (CDF). The analysis reveals only 87% of the Affymetrix® GeneChip® HG-U133 Plus 2 probes uniquely align to the current hg38 human assembly without mismatches. We also tested new mappings on the publicly available data series using rat and human data from GSE48611 and GSE72551 obtained from GEO, and illustrate that functional grouping allows for the subtle detection of regions of interest likely to have phenotypical consequences. CONCLUSION: Through reanalysis of the publicly available data series GSE48611 and GSE72551, we profiled the contribution of UTR and CDS regions to the gene expression levels globally. 
The comparison between region-based and gene-based results indicated that the expressed genes detected by gene-based and region-based CDFs show high consistency, and that region-based results allow us to detect changes in transcript formation.
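The probe selection and grouping steps named in the description (keep uniquely aligning probes, then group them by functional region) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name, the tuple layouts, the region labels and the coordinates are all hypothetical, and "unique" is assumed here to mean exactly one alignment with zero mismatches. Writing the result out as a Chip Description File is not shown.

```python
from collections import Counter, defaultdict


def build_region_probe_sets(alignments, regions):
    """Group uniquely aligning probes by the annotated region that contains them.

    alignments: list of (probe_id, chrom, start, end, mismatches) tuples (hypothetical layout)
    regions:    list of (region_id, chrom, start, end) tuples (hypothetical layout)
    """
    # Keep only perfect-match alignments (assumption: zero mismatches required).
    perfect = [a for a in alignments if a[4] == 0]
    # Count perfect-match hits per probe so multi-mapping probes can be dropped.
    hits = Counter(a[0] for a in perfect)

    probe_sets = defaultdict(list)
    for probe_id, chrom, start, end, _ in perfect:
        if hits[probe_id] != 1:
            continue  # nonspecific probe: aligns perfectly to more than one location
        for region_id, r_chrom, r_start, r_end in regions:
            # Assign the probe to every region that fully contains its alignment.
            if chrom == r_chrom and r_start <= start and end <= r_end:
                probe_sets[region_id].append(probe_id)
    return dict(probe_sets)


# Made-up example data, for illustration only.
alignments = [
    ("p1", "chr1", 100, 125, 0),  # unique perfect match inside the 5' UTR
    ("p2", "chr1", 300, 325, 0),  # perfect match ...
    ("p2", "chr2", 50, 75, 0),    # ... but also matches elsewhere, so it is dropped
    ("p3", "chr1", 400, 425, 1),  # one mismatch, so it is dropped
    ("p4", "chr1", 510, 535, 0),  # unique perfect match inside the CDS
]
regions = [
    ("GENE1:5UTR", "chr1", 90, 200),
    ("GENE1:CDS", "chr1", 500, 600),
]

probe_sets = build_region_probe_sets(alignments, regions)
print(probe_sets)
```

In this sketch a probe may be assigned to more than one region when annotations overlap (for example a gene and one of its exons), which mirrors the idea of building gene, transcript and region-based probe sets from the same filtered probes.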
format Online
Article
Text
id pubmed-5731501
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5731501 2017-12-19 Framework for reanalysis of publicly available Affymetrix® GeneChip® data sets based on functional regions of interest Saka, Ernur Harrison, Benjamin J. West, Kirk Petruska, Jeffrey C. Rouchka, Eric C. BMC Genomics Research BioMed Central 2017-12-06 /pmc/articles/PMC5731501/ /pubmed/29244006 http://dx.doi.org/10.1186/s12864-017-4266-5 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Framework for reanalysis of publicly available Affymetrix® GeneChip® data sets based on functional regions of interest
topic Research