An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era
BACKGROUND: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technol...
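Main authors: | Su, Zhenqiang; Fang, Hong; Hong, Huixiao; Shi, Leming; Zhang, Wenqian; Zhang, Wenwei; Zhang, Yanyan; Dong, Zirui; Lancashire, Lee J; Bessarabova, Marina; Yang, Xi; Ning, Baitang; Gong, Binsheng; Meehan, Joe; Xu, Joshua; Ge, Weigong; Perkins, Roger; Fischer, Matthias; Tong, Weida |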
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290828/ https://www.ncbi.nlm.nih.gov/pubmed/25633159 http://dx.doi.org/10.1186/s13059-014-0523-y |
_version_ | 1782352310405431296 |
---|---|
author | Su, Zhenqiang Fang, Hong Hong, Huixiao Shi, Leming Zhang, Wenqian Zhang, Wenwei Zhang, Yanyan Dong, Zirui Lancashire, Lee J Bessarabova, Marina Yang, Xi Ning, Baitang Gong, Binsheng Meehan, Joe Xu, Joshua Ge, Weigong Perkins, Roger Fischer, Matthias Tong, Weida |
author_facet | Su, Zhenqiang Fang, Hong Hong, Huixiao Shi, Leming Zhang, Wenqian Zhang, Wenwei Zhang, Yanyan Dong, Zirui Lancashire, Lee J Bessarabova, Marina Yang, Xi Ning, Baitang Gong, Binsheng Meehan, Joe Xu, Joshua Ge, Weigong Perkins, Roger Fischer, Matthias Tong, Weida |
author_sort | Su, Zhenqiang |
collection | PubMed |
description | BACKGROUND: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment? RESULTS: We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, and three levels of mapping complexity were revealed. Three algorithms representing different modeling complexity were applied to the three levels of mappings for each of the eight binary endpoints and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined. CONCLUSIONS: Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples; while RNA-seq-based models are less accurate in predicting microarray-profiled samples and are affected both by the choice of modeling algorithm and the gene mapping complexity. The results suggest continued usefulness of legacy microarray data and established microarray biomarkers and predictive models in the forthcoming RNA-seq era. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0523-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4290828 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-42908282015-01-28 An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era Su, Zhenqiang Fang, Hong Hong, Huixiao Shi, Leming Zhang, Wenqian Zhang, Wenwei Zhang, Yanyan Dong, Zirui Lancashire, Lee J Bessarabova, Marina Yang, Xi Ning, Baitang Gong, Binsheng Meehan, Joe Xu, Joshua Ge, Weigong Perkins, Roger Fischer, Matthias Tong, Weida Genome Biol Research BACKGROUND: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment? RESULTS: We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, and three levels of mapping complexity were revealed. Three algorithms representing different modeling complexity were applied to the three levels of mappings for each of the eight binary endpoints and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined. CONCLUSIONS: Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples; while RNA-seq-based models are less accurate in predicting microarray-profiled samples and are affected both by the choice of modeling algorithm and the gene mapping complexity. 
The results suggest continued usefulness of legacy microarray data and established microarray biomarkers and predictive models in the forthcoming RNA-seq era. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0523-y) contains supplementary material, which is available to authorized users. BioMed Central 2014-12-03 2014 /pmc/articles/PMC4290828/ /pubmed/25633159 http://dx.doi.org/10.1186/s13059-014-0523-y Text en © Su et al.; licensee BioMed Central. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Su, Zhenqiang Fang, Hong Hong, Huixiao Shi, Leming Zhang, Wenqian Zhang, Wenwei Zhang, Yanyan Dong, Zirui Lancashire, Lee J Bessarabova, Marina Yang, Xi Ning, Baitang Gong, Binsheng Meehan, Joe Xu, Joshua Ge, Weigong Perkins, Roger Fischer, Matthias Tong, Weida An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title | An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title_full | An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title_fullStr | An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title_full_unstemmed | An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title_short | An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era |
title_sort | investigation of biomarkers derived from legacy microarray data for their utility in the rna-seq era |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290828/ https://www.ncbi.nlm.nih.gov/pubmed/25633159 http://dx.doi.org/10.1186/s13059-014-0523-y |