Bicluster Sampled Coherence Metric (BSCM) provides an accurate environmental context for phenotype predictions
BACKGROUND: Biclustering is a popular method for identifying under which experimental conditions biological signatures are co-expressed. However, the general biclustering problem is NP-hard, offering room to focus algorithms on specific biological tasks. We hypothesize that conditional co-regulation...
Main Authors: | Danziger, Samuel A; Reiss, David J; Ratushny, Alexander V; Smith, Jennifer J; Plaisier, Christopher L; Aitchison, John D; Baliga, Nitin S |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4407105/ https://www.ncbi.nlm.nih.gov/pubmed/25881257 http://dx.doi.org/10.1186/1752-0509-9-S2-S1 |
_version_ | 1782367861426094080 |
---|---|
author | Danziger, Samuel A; Reiss, David J; Ratushny, Alexander V; Smith, Jennifer J; Plaisier, Christopher L; Aitchison, John D; Baliga, Nitin S |
collection | PubMed |
description | BACKGROUND: Biclustering is a popular method for identifying the experimental conditions under which biological signatures are co-expressed. However, the general biclustering problem is NP-hard, offering room to focus algorithms on specific biological tasks. We hypothesize that conditional co-regulation of genes is a key factor in determining cell phenotype and that accurately segregating conditions in biclusters will improve such predictions. Thus, we developed a bicluster sampled coherence metric (BSCM) for determining which conditions and signals should be included in a bicluster. RESULTS: Our BSCM calculates condition- and cluster-size-specific p-values, and we incorporated these into the popular integrated biclustering algorithm cMonkey. We demonstrate that incorporation of our new algorithm significantly improves bicluster co-regulation scores (p-value = 0.009) and GO annotation scores (p-value = 0.004). Additionally, we used a bicluster-based signal to predict whether a given experimental condition will result in yeast peroxisome induction. Using the new algorithm, the classifier accuracy improves from 41.9% to 76.1% correct. CONCLUSIONS: We demonstrate that the proposed BSCM helps determine which signals ought to be co-clustered, resulting in more accurately assigned bicluster membership. Furthermore, we show that BSCM can be extended to more accurately detect the experimental conditions under which the genes are co-clustered. Features derived from this more accurate analysis of conditional regulation result in a dramatic improvement in the ability to predict a cellular phenotype in yeast. The latest cMonkey is available for download at https://github.com/baliga-lab/cmonkey2. The experimental data and source code featured in this paper are available at http://AitchisonLab.com/BSCM. BSCM has been incorporated into the official cMonkey release. |
format | Online Article Text |
id | pubmed-4407105 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4407105 2015-04-29 Bicluster Sampled Coherence Metric (BSCM) provides an accurate environmental context for phenotype predictions. Danziger, Samuel A; Reiss, David J; Ratushny, Alexander V; Smith, Jennifer J; Plaisier, Christopher L; Aitchison, John D; Baliga, Nitin S. BMC Syst Biol. Research. BioMed Central 2015-04-15 /pmc/articles/PMC4407105/ /pubmed/25881257 http://dx.doi.org/10.1186/1752-0509-9-S2-S1 Text en Copyright © 2015 Danziger et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Bicluster Sampled Coherence Metric (BSCM) provides an accurate environmental context for phenotype predictions |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4407105/ https://www.ncbi.nlm.nih.gov/pubmed/25881257 http://dx.doi.org/10.1186/1752-0509-9-S2-S1 |
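
The abstract in this record states that BSCM calculates condition- and cluster-size-specific p-values, and the metric's name suggests they are obtained by sampling. The record does not describe the algorithm itself, so the following is only a minimal sketch of one generic way such a p-value could be formed: compare a cluster's spread in one condition against random gene sets of the same size in that same condition. It is not the authors' implementation and not part of the cMonkey API. The function name `sampled_coherence_pvalue`, the variance-based coherence statistic, and the toy expression matrix are all assumptions made for illustration.

```python
import numpy as np


def sampled_coherence_pvalue(expr, cluster_rows, condition, n_samples=1000, seed=0):
    """Empirical p-value for a cluster's coherence in a single condition.

    expr         : 2-D array, genes (rows) x conditions (columns).
    cluster_rows : indices of the genes in the candidate bicluster.
    condition    : column index of the condition being tested.

    ASSUMPTION: coherence is measured as the variance of the cluster's
    values in that column (lower variance = more coherent). The background
    is built from random gene sets of the same size in the same column, so
    the p-value is specific to both the condition and the cluster size.
    """
    rng = np.random.default_rng(seed)
    column = expr[:, condition]
    k = len(cluster_rows)
    observed = np.var(column[cluster_rows])
    sampled = np.array(
        [np.var(rng.choice(column, size=k, replace=False)) for _ in range(n_samples)]
    )
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + np.sum(sampled <= observed)) / (1 + n_samples)


# Toy data (hypothetical): 200 genes, 2 conditions, genes 0-9 form the cluster.
rng = np.random.default_rng(1)
expr = rng.normal(size=(200, 2))
cluster = np.arange(10)
expr[cluster, 0] = 5.0                         # tightly co-expressed in condition 0
expr[cluster, 1] = np.tile([-10.0, 10.0], 5)   # scattered in condition 1

p_coherent = sampled_coherence_pvalue(expr, cluster, condition=0)
p_incoherent = sampled_coherence_pvalue(expr, cluster, condition=1)
print(p_coherent, p_incoherent)
```

In this toy setup the p-value for condition 0 is small and the p-value for condition 1 is large, so a threshold on the p-value would keep condition 0 in the bicluster and drop condition 1. How the published method defines coherence and builds its background distribution should be taken from the paper and the cMonkey source, not from this sketch.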