Identifying differential exon splicing using linear models and correlation coefficients

BACKGROUND: With the availability of the Affymetrix exon arrays a number of tools have been developed to enable the analysis. These however can be expensive or have several pre-installation requirements. This led us to develop an analysis workflow for analysing differential splicing using freely ava...

Bibliographic Details
Main authors: Shah, Sonia H; Pallas, Jacqueline A
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2636774/
https://www.ncbi.nlm.nih.gov/pubmed/19154578
http://dx.doi.org/10.1186/1471-2105-10-26
author Shah, Sonia H
Pallas, Jacqueline A
collection PubMed
description BACKGROUND: With the availability of Affymetrix exon arrays, a number of tools have been developed to analyse the data. These, however, can be expensive or have several pre-installation requirements. This led us to develop a workflow for analysing differential splicing using freely available software packages that are already widely used for gene expression analysis. The workflow uses the packages in the standard installation of R and Bioconductor (BiocLite) to identify differential splicing. We use the splice index method with the LIMMA framework. The main drawback of this approach is that it relies on accurate estimates of gene expression from the probe-level data. Methods such as RMA and PLIER may misestimate gene expression when a large proportion of exons are spliced. We therefore present the novel concept of a gene correlation coefficient calculated using only the probeset expression pattern within a gene. We show that genes with lower correlation coefficients are likely to be differentially spliced. RESULTS: The LIMMA approach was used to identify several tissue-specific transcripts and splicing events that are supported by previous experimental studies. Filtering the data is necessary to reduce the false positive rate, particularly removing cross-hybridising probesets and exons and genes that are not expressed in all samples. The LIMMA approach ranked genes containing a single or a few differentially spliced exons much higher than genes containing several differentially spliced exons. On the other hand, we found the gene correlation coefficient approach better for identifying genes with a large number of differentially spliced exons. CONCLUSION: We show that LIMMA can be used to identify differential exon splicing from Affymetrix exon array data. Though further work would be necessary to develop the use of correlation coefficients into a complete analysis approach, the preliminary results demonstrate their usefulness for identifying differentially spliced genes. The two approaches are complementary, as they can potentially identify different subsets of genes (single/few spliced exons vs. large transcript structure differences).
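The splice index mentioned in the abstract is commonly defined as the log2 ratio of an exon's signal to the signal of its gene, compared between sample groups. The sketch below shows only that arithmetic in Python. It is not the authors' workflow, which fits LIMMA linear models in R, and the function names and signal values are hypothetical, chosen for illustration.

```python
import math


def splice_index(exon_signal, gene_signal):
    """Return the log2 ratio of exon-level to gene-level signal for one sample."""
    return math.log2(exon_signal / gene_signal)


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def splice_index_difference(exon_a, gene_a, exon_b, gene_b):
    """Difference in mean splice index between group A and group B.

    Each argument is a list with one value per sample; exon and gene
    lists for the same group must be in the same sample order.
    """
    si_a = [splice_index(e, g) for e, g in zip(exon_a, gene_a)]
    si_b = [splice_index(e, g) for e, g in zip(exon_b, gene_b)]
    return mean(si_a) - mean(si_b)


# Hypothetical (made-up) signals: the gene is expressed at the same level
# in both groups, but the exon is four-fold lower in group A.
exon_a, gene_a = [200.0, 400.0], [800.0, 1600.0]
exon_b, gene_b = [800.0, 1600.0], [800.0, 1600.0]

diff = splice_index_difference(exon_a, gene_a, exon_b, gene_b)
print(diff)
```

A difference far from zero suggests the exon is included at different rates in the two groups. Because the ratio depends on the gene-level estimate, an inaccurate gene-level signal distorts every exon's index, which is the drawback the abstract notes.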
format Text
id pubmed-2636774
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2636774 2009-02-06 Identifying differential exon splicing using linear models and correlation coefficients Shah, Sonia H Pallas, Jacqueline A BMC Bioinformatics Methodology Article BioMed Central 2009-01-20 /pmc/articles/PMC2636774/ /pubmed/19154578 http://dx.doi.org/10.1186/1471-2105-10-26 Text en Copyright © 2009 Shah and Pallas; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying differential exon splicing using linear models and correlation coefficients
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2636774/
https://www.ncbi.nlm.nih.gov/pubmed/19154578
http://dx.doi.org/10.1186/1471-2105-10-26