
Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data

BACKGROUND: Most ‘transcriptomic’ data from microarrays are generated from small sample sizes compared to the large number of measured biomarkers, making it very difficult to build accurate and generalizable disease state classification models. Integrating information from different, but related, ‘t...


Bibliographic Details
Main authors: Ogoe, Henry A., Visweswaran, Shyam, Lu, Xinghua, Gopalakrishnan, Vanathi
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4512094/
https://www.ncbi.nlm.nih.gov/pubmed/26202217
http://dx.doi.org/10.1186/s12859-015-0643-8
author Ogoe, Henry A.
Visweswaran, Shyam
Lu, Xinghua
Gopalakrishnan, Vanathi
collection PubMed
description BACKGROUND: Most ‘transcriptomic’ data from microarrays are generated from small sample sizes compared to the large number of measured biomarkers, making it very difficult to build accurate and generalizable disease state classification models. Integrating information from different, but related, ‘transcriptomic’ data may help build better classification models. However, most proposed methods for integrative analysis of ‘transcriptomic’ data cannot incorporate domain knowledge, which can improve model performance. To this end, we have developed a methodology that leverages transfer rule learning and functional modules, which we call TRL-FM, to capture and abstract domain knowledge in the form of classification rules to facilitate integrative modeling of multiple gene expression data. TRL-FM is an extension of the transfer rule learner (TRL) that we developed previously. The goal of this study was to test our hypothesis that “an integrative model obtained via the TRL-FM approach outperforms traditional models based on single gene expression data sources”.
RESULTS: To evaluate the feasibility of the TRL-FM framework, we compared the area under the ROC curve (AUC) of models developed with TRL-FM and other traditional methods, using 21 microarray datasets generated from three studies on brain cancer, prostate cancer, and lung disease, respectively. The results show that TRL-FM statistically significantly outperforms TRL as well as traditional models based on single source data. In addition, TRL-FM performed better than other integrative models driven by meta-analysis and cross-platform data merging.
CONCLUSIONS: The capability of utilizing transferred abstract knowledge derived from source data using feature mapping enables the TRL-FM framework to mimic the human process of learning and adaptation when performing related tasks. The novel TRL-FM methodology for integrative modeling for multiple ‘transcriptomic’ datasets is able to intelligently incorporate domain knowledge that traditional methods might disregard, to boost predictive power and generalization performance. In this study, TRL-FM’s abstraction of knowledge is achieved in the form of functional modules, but the overall framework is generalizable in that different approaches of acquiring abstract knowledge can be integrated into this framework.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0643-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4512094
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4512094 2015-07-24 Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data Ogoe, Henry A. Visweswaran, Shyam Lu, Xinghua Gopalakrishnan, Vanathi BMC Bioinformatics Methodology Article BioMed Central 2015-07-23 /pmc/articles/PMC4512094/ /pubmed/26202217 http://dx.doi.org/10.1186/s12859-015-0643-8 Text en © Ogoe et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data
topic Methodology Article