Knowledge-based variable selection for learning rules from proteomic data
BACKGROUND: The incorporation of biological knowledge can enhance the analysis of biomedical data. We present a novel method that uses a proteomic knowledge base to enhance the performance of a rule-learning algorithm in identifying putative biomarkers of disease from high-dimensional proteomic mass...
Main authors: | Lustgarten, Jonathan L; Visweswaran, Shyam; Bowser, Robert P; Hogan, William R; Gopalakrishnan, Vanathi |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745687/ https://www.ncbi.nlm.nih.gov/pubmed/19761570 http://dx.doi.org/10.1186/1471-2105-10-S9-S16 |
_version_ | 1782171986558976000 |
---|---|
author | Lustgarten, Jonathan L Visweswaran, Shyam Bowser, Robert P Hogan, William R Gopalakrishnan, Vanathi |
author_facet | Lustgarten, Jonathan L Visweswaran, Shyam Bowser, Robert P Hogan, William R Gopalakrishnan, Vanathi |
author_sort | Lustgarten, Jonathan L |
collection | PubMed |
description | BACKGROUND: The incorporation of biological knowledge can enhance the analysis of biomedical data. We present a novel method that uses a proteomic knowledge base to enhance the performance of a rule-learning algorithm in identifying putative biomarkers of disease from high-dimensional proteomic mass spectral data. In particular, we use the Empirical Proteomics Ontology Knowledge Base (EPO-KB) that contains previously identified and validated proteomic biomarkers to select m/zs in a proteomic dataset prior to analysis to increase performance. RESULTS: We show that using EPO-KB as a pre-processing method, specifically selecting all biomarkers found only in the biofluid of the proteomic dataset, reduces the dimensionality by 95% and provides a statistically significantly greater increase in performance over no variable selection and random variable selection. CONCLUSION: Knowledge-based variable selection even with a sparsely-populated resource such as the EPO-KB increases overall performance of rule-learning for disease classification from high-dimensional proteomic mass spectra. |
format | Text |
id | pubmed-2745687 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2745687 2009-09-18 Knowledge-based variable selection for learning rules from proteomic data Lustgarten, Jonathan L Visweswaran, Shyam Bowser, Robert P Hogan, William R Gopalakrishnan, Vanathi BMC Bioinformatics Proceedings BACKGROUND: The incorporation of biological knowledge can enhance the analysis of biomedical data. We present a novel method that uses a proteomic knowledge base to enhance the performance of a rule-learning algorithm in identifying putative biomarkers of disease from high-dimensional proteomic mass spectral data. In particular, we use the Empirical Proteomics Ontology Knowledge Base (EPO-KB) that contains previously identified and validated proteomic biomarkers to select m/zs in a proteomic dataset prior to analysis to increase performance. RESULTS: We show that using EPO-KB as a pre-processing method, specifically selecting all biomarkers found only in the biofluid of the proteomic dataset, reduces the dimensionality by 95% and provides a statistically significantly greater increase in performance over no variable selection and random variable selection. CONCLUSION: Knowledge-based variable selection even with a sparsely-populated resource such as the EPO-KB increases overall performance of rule-learning for disease classification from high-dimensional proteomic mass spectra. BioMed Central 2009-09-17 /pmc/articles/PMC2745687/ /pubmed/19761570 http://dx.doi.org/10.1186/1471-2105-10-S9-S16 Text en Copyright © 2009 Lustgarten et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Lustgarten, Jonathan L Visweswaran, Shyam Bowser, Robert P Hogan, William R Gopalakrishnan, Vanathi Knowledge-based variable selection for learning rules from proteomic data |
title | Knowledge-based variable selection for learning rules from proteomic data |
title_full | Knowledge-based variable selection for learning rules from proteomic data |
title_fullStr | Knowledge-based variable selection for learning rules from proteomic data |
title_full_unstemmed | Knowledge-based variable selection for learning rules from proteomic data |
title_short | Knowledge-based variable selection for learning rules from proteomic data |
title_sort | knowledge-based variable selection for learning rules from proteomic data |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745687/ https://www.ncbi.nlm.nih.gov/pubmed/19761570 http://dx.doi.org/10.1186/1471-2105-10-S9-S16 |
work_keys_str_mv | AT lustgartenjonathanl knowledgebasedvariableselectionforlearningrulesfromproteomicdata AT visweswaranshyam knowledgebasedvariableselectionforlearningrulesfromproteomicdata AT bowserrobertp knowledgebasedvariableselectionforlearningrulesfromproteomicdata AT hoganwilliamr knowledgebasedvariableselectionforlearningrulesfromproteomicdata AT gopalakrishnanvanathi knowledgebasedvariableselectionforlearningrulesfromproteomicdata |
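
The abstract describes selecting m/z values in a dataset, before rule learning, by keeping only those that match biomarkers previously reported in the same biofluid. A minimal sketch of that idea follows. It is not the authors' implementation and does not use any real EPO-KB interface: the `KnownBiomarker` type, the `select_mz` function, the 0.5% relative matching tolerance, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBiomarker:
    """Hypothetical knowledge-base entry (not the real EPO-KB schema)."""
    mz: float       # reported mass-to-charge ratio
    biofluid: str   # biofluid in which the biomarker was reported


def select_mz(dataset_mzs, knowledge_base, biofluid, tolerance=0.005):
    """Keep dataset m/z values that lie within a relative tolerance of
    any knowledge-base biomarker reported in the given biofluid.

    The relative tolerance is an assumption; the abstract does not
    state the matching criterion used in the paper.
    """
    known = [b.mz for b in knowledge_base if b.biofluid == biofluid]
    return [
        mz for mz in dataset_mzs
        if any(abs(mz - k) <= tolerance * k for k in known)
    ]


# Toy values, invented for this example only.
kb = [
    KnownBiomarker(6631.0, "serum"),
    KnownBiomarker(8933.0, "serum"),
    KnownBiomarker(13780.0, "csf"),
]
mzs = [2000.0, 6640.0, 8900.0, 13800.0, 15000.0]

kept = select_mz(mzs, kb, "serum")
print(kept)  # [6640.0, 8900.0]
```

Only the retained m/z columns would then be passed to the rule learner, which is where the dimensionality reduction reported in the abstract would come from.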