Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data
BACKGROUND: Most ‘transcriptomic’ data from microarrays are generated from small sample sizes compared to the large number of measured biomarkers, making it very difficult to build accurate and generalizable disease state classification models. Integrating information from different, but related, ‘t...
Main Authors: | Ogoe, Henry A., Visweswaran, Shyam, Lu, Xinghua, Gopalakrishnan, Vanathi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4512094/ https://www.ncbi.nlm.nih.gov/pubmed/26202217 http://dx.doi.org/10.1186/s12859-015-0643-8 |
_version_ | 1782382444742180864 |
---|---|
author | Ogoe, Henry A. Visweswaran, Shyam Lu, Xinghua Gopalakrishnan, Vanathi |
author_facet | Ogoe, Henry A. Visweswaran, Shyam Lu, Xinghua Gopalakrishnan, Vanathi |
author_sort | Ogoe, Henry A. |
collection | PubMed |
description | BACKGROUND: Most ‘transcriptomic’ data from microarrays are generated from small sample sizes compared to the large number of measured biomarkers, making it very difficult to build accurate and generalizable disease state classification models. Integrating information from different, but related, ‘transcriptomic’ data may help build better classification models. However, most proposed methods for integrative analysis of ‘transcriptomic’ data cannot incorporate domain knowledge, which can improve model performance. To this end, we have developed a methodology that leverages transfer rule learning and functional modules, which we call TRL-FM, to capture and abstract domain knowledge in the form of classification rules to facilitate integrative modeling of multiple gene expression data. TRL-FM is an extension of the transfer rule learner (TRL) that we developed previously. The goal of this study was to test our hypothesis that “an integrative model obtained via the TRL-FM approach outperforms traditional models based on single gene expression data sources”. RESULTS: To evaluate the feasibility of the TRL-FM framework, we compared the area under the ROC curve (AUC) of models developed with TRL-FM and other traditional methods, using 21 microarray datasets generated from three studies on brain cancer, prostate cancer, and lung disease, respectively. The results show that TRL-FM statistically significantly outperforms TRL as well as traditional models based on single source data. In addition, TRL-FM performed better than other integrative models driven by meta-analysis and cross-platform data merging. CONCLUSIONS: The capability of utilizing transferred abstract knowledge derived from source data using feature mapping enables the TRL-FM framework to mimic the human process of learning and adaptation when performing related tasks. The novel TRL-FM methodology for integrative modeling for multiple ‘transcriptomic’ datasets is able to intelligently incorporate domain knowledge that traditional methods might disregard, to boost predictive power and generalization performance. In this study, TRL-FM’s abstraction of knowledge is achieved in the form of functional modules, but the overall framework is generalizable in that different approaches of acquiring abstract knowledge can be integrated into this framework. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0643-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4512094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-45120942015-07-24 Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data Ogoe, Henry A. Visweswaran, Shyam Lu, Xinghua Gopalakrishnan, Vanathi BMC Bioinformatics Methodology Article BACKGROUND: Most ‘transcriptomic’ data from microarrays are generated from small sample sizes compared to the large number of measured biomarkers, making it very difficult to build accurate and generalizable disease state classification models. Integrating information from different, but related, ‘transcriptomic’ data may help build better classification models. However, most proposed methods for integrative analysis of ‘transcriptomic’ data cannot incorporate domain knowledge, which can improve model performance. To this end, we have developed a methodology that leverages transfer rule learning and functional modules, which we call TRL-FM, to capture and abstract domain knowledge in the form of classification rules to facilitate integrative modeling of multiple gene expression data. TRL-FM is an extension of the transfer rule learner (TRL) that we developed previously. The goal of this study was to test our hypothesis that “an integrative model obtained via the TRL-FM approach outperforms traditional models based on single gene expression data sources”. RESULTS: To evaluate the feasibility of the TRL-FM framework, we compared the area under the ROC curve (AUC) of models developed with TRL-FM and other traditional methods, using 21 microarray datasets generated from three studies on brain cancer, prostate cancer, and lung disease, respectively. The results show that TRL-FM statistically significantly outperforms TRL as well as traditional models based on single source data. In addition, TRL-FM performed better than other integrative models driven by meta-analysis and cross-platform data merging. CONCLUSIONS: The capability of utilizing transferred abstract knowledge derived from source data using feature mapping enables the TRL-FM framework to mimic the human process of learning and adaptation when performing related tasks. The novel TRL-FM methodology for integrative modeling for multiple ‘transcriptomic’ datasets is able to intelligently incorporate domain knowledge that traditional methods might disregard, to boost predictive power and generalization performance. In this study, TRL-FM’s abstraction of knowledge is achieved in the form of functional modules, but the overall framework is generalizable in that different approaches of acquiring abstract knowledge can be integrated into this framework. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0643-8) contains supplementary material, which is available to authorized users. BioMed Central 2015-07-23 /pmc/articles/PMC4512094/ /pubmed/26202217 http://dx.doi.org/10.1186/s12859-015-0643-8 Text en © Ogoe et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Ogoe, Henry A. Visweswaran, Shyam Lu, Xinghua Gopalakrishnan, Vanathi Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title | Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title_full | Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title_fullStr | Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title_full_unstemmed | Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title_short | Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
title_sort | knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4512094/ https://www.ncbi.nlm.nih.gov/pubmed/26202217 http://dx.doi.org/10.1186/s12859-015-0643-8 |
work_keys_str_mv | AT ogoehenrya knowledgetransferviaclassificationrulesusingfunctionalmappingforintegrativemodelingofgeneexpressiondata AT visweswaranshyam knowledgetransferviaclassificationrulesusingfunctionalmappingforintegrativemodelingofgeneexpressiondata AT luxinghua knowledgetransferviaclassificationrulesusingfunctionalmappingforintegrativemodelingofgeneexpressiondata AT gopalakrishnanvanathi knowledgetransferviaclassificationrulesusingfunctionalmappingforintegrativemodelingofgeneexpressiondata |