Spice: discovery of phenotype-determining component interplays

BACKGROUND: A latent behavior of a biological cell is complex. Deriving the underlying simplicity, or the fundamental rules governing this behavior, has been the Holy Grail of systems biology. Data-driven prediction of the system components and their component interplays that are responsible for the...

Bibliographic Details
Main authors: Chen, Zhengzhang; Padmanabhan, Kanchana; Rocha, Andrea M; Shpanskaya, Yekaterina; Mihelcic, James R; Scott, Kathleen; Samatova, Nagiza F
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515406/
https://www.ncbi.nlm.nih.gov/pubmed/22583800
http://dx.doi.org/10.1186/1752-0509-6-40
_version_ 1782252172910526464
author Chen, Zhengzhang
Padmanabhan, Kanchana
Rocha, Andrea M
Shpanskaya, Yekaterina
Mihelcic, James R
Scott, Kathleen
Samatova, Nagiza F
author_sort Chen, Zhengzhang
collection PubMed
description BACKGROUND: A latent behavior of a biological cell is complex. Deriving the underlying simplicity, or the fundamental rules governing this behavior, has been the Holy Grail of systems biology. Data-driven prediction of the system components, and of the component interplays responsible for the target system’s phenotype, is a key and challenging step in this endeavor. RESULTS: The proposed approach, which we call System Phenotype-related Interplaying Components Enumerator (Spice), iteratively enumerates statistically significant system components that are hypothesized (1) to play an important role in defining the specificity of the target system’s phenotype(s); (2) to exhibit functionally coherent behavior, namely, to act in a coordinated manner to perform the phenotype-specific function; and (3) to improve the predictive skill for the system’s phenotype(s) when used collectively in an ensemble of predictive models. Spice can be applied to both instance-based data and network-based data. In validation, Spice effectively identified system components related to three target phenotypes: biohydrogen production, motility, and cancer. Manual curation of the results agreed with the known phenotype-related system components reported in the literature. Additionally, using the identified system components as discriminatory features improved the prediction accuracy by 10% on the phenotype-classification task when compared to a number of state-of-the-art methods applied to eight benchmark microarray data sets. CONCLUSION: We formulate a problem, the enumeration of phenotype-determining system component interplays, and propose an effective methodology (Spice) to address it. Spice improved the identification of cancer-related groups of genes from various microarray data sets and detected groups of genes associated with microbial biohydrogen production and motility, many of which were reported in the literature. Spice also improved the predictive skill of the system’s phenotype determination compared to individual classifiers and/or other ensemble methods, such as bagging, boosting, random forest, nearest shrunken centroid, and the random forest variable selection method.
format Online
Article
Text
id pubmed-3515406
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3515406 2012-12-06 Spice: discovery of phenotype-determining component interplays Chen, Zhengzhang Padmanabhan, Kanchana Rocha, Andrea M Shpanskaya, Yekaterina Mihelcic, James R Scott, Kathleen Samatova, Nagiza F BMC Syst Biol Research Article BioMed Central 2012-05-14 /pmc/articles/PMC3515406/ /pubmed/22583800 http://dx.doi.org/10.1186/1752-0509-6-40 Text en Copyright ©2012 Chen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Spice: discovery of phenotype-determining component interplays
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515406/
https://www.ncbi.nlm.nih.gov/pubmed/22583800
http://dx.doi.org/10.1186/1752-0509-6-40