Multiple-platform data integration method with application to combined analysis of microarray and proteomic data

BACKGROUND: It is desirable in genomic studies to select biomarkers that differentiate between normal and diseased populations based on related data sets from different platforms, including microarray expression and proteomic data. Most recently developed integration methods focus on correlation ana...

Bibliographic Details
Main authors: Wu, Shicheng, Xu, Yawen, Feng, Zeny, Yang, Xiaojian, Wang, Xiaogang, Gao, Xin
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3770449/
https://www.ncbi.nlm.nih.gov/pubmed/23198695
http://dx.doi.org/10.1186/1471-2105-13-320
_version_ 1782284091828207616
author Wu, Shicheng
Xu, Yawen
Feng, Zeny
Yang, Xiaojian
Wang, Xiaogang
Gao, Xin
collection PubMed
description BACKGROUND: It is desirable in genomic studies to select biomarkers that differentiate between normal and diseased populations based on related data sets from different platforms, including microarray expression and proteomic data. Most recently developed integration methods focus on correlation analyses between gene and protein expression profiles. The correlation methods select biomarkers with concordant behavior across two platforms but do not directly select differentially expressed biomarkers. Other integration methods have been proposed to combine statistical evidence in terms of ranks and p-values, but they do not account for the dependency relationships among the data across platforms. RESULTS: In this paper, we propose an integration method to perform hypothesis testing and biomarker selection based on multi-platform data sets observed from normal and diseased populations. The types of test statistics can vary across the platforms, and their marginal distributions can differ. The observed test statistics are aggregated across the data platforms in a weighted scheme, where the weights account for the different variabilities of the test statistics. The overall decision is based on the empirical distribution of the aggregated statistic obtained through random permutations. CONCLUSION: In both simulation studies and real biological data analyses, our proposed method of multi-platform integration has better control over false discovery rates and higher positive selection rates than the uncombined method. The proposed method is also shown to be more powerful than the rank aggregation method.
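The weighted aggregation and permutation steps summarized in the description can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, since the abstract does not give the details: the per-platform statistic is taken to be a Welch t statistic, the weights are taken to be the inverse of each statistic's standard deviation under permutation, the same subjects are assumed to be measured on every platform, and the names `welch_t` and `integrated_pvalues` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def welch_t(x, labels):
    """Welch t statistic per feature. x is (samples, features); labels is 0/1."""
    a, b = x[labels == 1], x[labels == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se


def integrated_pvalues(platforms, labels, n_perm=500, seed=0):
    """Permutation p-values for a weighted sum of per-platform statistics.

    platforms: list of (samples, features) arrays sharing sample and feature order.
    labels:    0/1 array marking normal (0) and diseased (1) samples.
    """
    rng = np.random.default_rng(seed)
    observed = np.stack([welch_t(x, labels) for x in platforms])  # (P, F)

    # Permute the group labels once per round and reuse them on every platform,
    # so the dependency between platforms is preserved in the null distribution.
    permuted = np.empty((n_perm,) + observed.shape)  # (n_perm, P, F)
    for i in range(n_perm):
        shuffled = rng.permutation(labels)
        permuted[i] = np.stack([welch_t(x, shuffled) for x in platforms])

    # Assumed weighting: inverse permutation standard deviation, normalized
    # so that the weights for each feature sum to one across platforms.
    weights = 1.0 / permuted.std(axis=0, ddof=1)  # (P, F)
    weights /= weights.sum(axis=0)

    agg_obs = np.abs((weights * observed).sum(axis=0))   # (F,)
    agg_perm = np.abs((weights * permuted).sum(axis=1))  # (n_perm, F)

    # Two-sided empirical p-value with the usual +1 correction.
    return (1 + (agg_perm >= agg_obs).sum(axis=0)) / (n_perm + 1)
```

Reusing one shuffled label vector across all platforms is the design choice that lets the empirical null reflect cross-platform dependence; shuffling each platform independently would discard it. The resulting p-values would still need a multiple-testing adjustment before any statement about false discovery rates.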
format Online
Article
Text
id pubmed-3770449
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3770449 2013-09-12 Multiple-platform data integration method with application to combined analysis of microarray and proteomic data Wu, Shicheng Xu, Yawen Feng, Zeny Yang, Xiaojian Wang, Xiaogang Gao, Xin BMC Bioinformatics Methodology Article
BioMed Central 2012-12-02 /pmc/articles/PMC3770449/ /pubmed/23198695 http://dx.doi.org/10.1186/1471-2105-13-320 Text en Copyright © 2012 Wu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multiple-platform data integration method with application to combined analysis of microarray and proteomic data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3770449/
https://www.ncbi.nlm.nih.gov/pubmed/23198695
http://dx.doi.org/10.1186/1471-2105-13-320