
An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era

BACKGROUND: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technol...


Bibliographic Details
Main Authors: Su, Zhenqiang; Fang, Hong; Hong, Huixiao; Shi, Leming; Zhang, Wenqian; Zhang, Wenwei; Zhang, Yanyan; Dong, Zirui; Lancashire, Lee J; Bessarabova, Marina; Yang, Xi; Ning, Baitang; Gong, Binsheng; Meehan, Joe; Xu, Joshua; Ge, Weigong; Perkins, Roger; Fischer, Matthias; Tong, Weida
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290828/
https://www.ncbi.nlm.nih.gov/pubmed/25633159
http://dx.doi.org/10.1186/s13059-014-0523-y
author Su, Zhenqiang
Fang, Hong
Hong, Huixiao
Shi, Leming
Zhang, Wenqian
Zhang, Wenwei
Zhang, Yanyan
Dong, Zirui
Lancashire, Lee J
Bessarabova, Marina
Yang, Xi
Ning, Baitang
Gong, Binsheng
Meehan, Joe
Xu, Joshua
Ge, Weigong
Perkins, Roger
Fischer, Matthias
Tong, Weida
collection PubMed
description BACKGROUND: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment? RESULTS: We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, and three levels of mapping complexity were revealed. Three algorithms representing different modeling complexity were applied to the three levels of mappings for each of the eight binary endpoints and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined. CONCLUSIONS: Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples; while RNA-seq-based models are less accurate in predicting microarray-profiled samples and are affected both by the choice of modeling algorithm and the gene mapping complexity. The results suggest continued usefulness of legacy microarray data and established microarray biomarkers and predictive models in the forthcoming RNA-seq era. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0523-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4290828
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4290828 2015-01-28 Genome Biol Research BioMed Central 2014-12-03 2014 /pmc/articles/PMC4290828/ /pubmed/25633159 http://dx.doi.org/10.1186/s13059-014-0523-y Text en © Su et al.; licensee BioMed Central. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290828/
https://www.ncbi.nlm.nih.gov/pubmed/25633159
http://dx.doi.org/10.1186/s13059-014-0523-y