A semi-parametric statistical model for integrating gene expression profiles across different platforms
BACKGROUND: Determining differentially expressed genes (DEGs) between biological samples is key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs...
Main Authors: | Lyu, Yafei, Li, Qunhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895261/ https://www.ncbi.nlm.nih.gov/pubmed/26818110 http://dx.doi.org/10.1186/s12859-015-0847-y |
_version_ | 1782435813435375616 |
---|---|
author | Lyu, Yafei Li, Qunhua |
author_facet | Lyu, Yafei Li, Qunhua |
author_sort | Lyu, Yafei |
collection | PubMed |
description | BACKGROUND: Determining differentially expressed genes (DEGs) between biological samples is key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies. Integrating data across these two platforms has the potential to improve the power and reliability of DEG detection. METHODS: We propose a rank-based semi-parametric model to determine DEGs using information across different sources and apply it to the integration of RNA-seq and microarray data. By incorporating both the significance of differential expression and the consistency across platforms, our method effectively detects DEGs with moderate but consistent signals. We demonstrate the effectiveness of our method using simulation studies, MAQC/SEQC data and a synthetic microRNA dataset. CONCLUSIONS: Our integration method is not only robust to noise and heterogeneity in the data, but also adaptive to the structure of the data. In our simulations and real data studies, our approach shows higher discriminative power and identifies more biologically relevant DEGs than eBayes, DESeq and some commonly used meta-analysis methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0847-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4895261 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4895261 2016-06-10 A semi-parametric statistical model for integrating gene expression profiles across different platforms Lyu, Yafei Li, Qunhua BMC Bioinformatics Proceedings BACKGROUND: Determining differentially expressed genes (DEGs) between biological samples is key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies. Integrating data across these two platforms has the potential to improve the power and reliability of DEG detection. METHODS: We propose a rank-based semi-parametric model to determine DEGs using information across different sources and apply it to the integration of RNA-seq and microarray data. By incorporating both the significance of differential expression and the consistency across platforms, our method effectively detects DEGs with moderate but consistent signals. We demonstrate the effectiveness of our method using simulation studies, MAQC/SEQC data and a synthetic microRNA dataset. CONCLUSIONS: Our integration method is not only robust to noise and heterogeneity in the data, but also adaptive to the structure of the data. In our simulations and real data studies, our approach shows higher discriminative power and identifies more biologically relevant DEGs than eBayes, DESeq and some commonly used meta-analysis methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0847-y) contains supplementary material, which is available to authorized users. BioMed Central 2016-01-11 /pmc/articles/PMC4895261/ /pubmed/26818110 http://dx.doi.org/10.1186/s12859-015-0847-y Text en © Lyu and Li. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Lyu, Yafei Li, Qunhua A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title | A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title_full | A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title_fullStr | A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title_full_unstemmed | A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title_short | A semi-parametric statistical model for integrating gene expression profiles across different platforms |
title_sort | semi-parametric statistical model for integrating gene expression profiles across different platforms |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895261/ https://www.ncbi.nlm.nih.gov/pubmed/26818110 http://dx.doi.org/10.1186/s12859-015-0847-y |
work_keys_str_mv | AT lyuyafei asemiparametricstatisticalmodelforintegratinggeneexpressionprofilesacrossdifferentplatforms AT liqunhua asemiparametricstatisticalmodelforintegratinggeneexpressionprofilesacrossdifferentplatforms AT lyuyafei semiparametricstatisticalmodelforintegratinggeneexpressionprofilesacrossdifferentplatforms AT liqunhua semiparametricstatisticalmodelforintegratinggeneexpressionprofilesacrossdifferentplatforms |