
Accelerated search for biomolecular network models to interpret high-throughput experimental data

BACKGROUND: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gen...


Bibliographic Details
Main Authors:	Datta, Suman; Sokhansanj, Bahrad A
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1940030/ https://www.ncbi.nlm.nih.gov/pubmed/17640351 http://dx.doi.org/10.1186/1471-2105-8-258

_version_	1782134432181780480
author	Datta, Suman Sokhansanj, Bahrad A
author_facet	Datta, Suman Sokhansanj, Bahrad A
author_sort	Datta, Suman
collection	PubMed
description	BACKGROUND: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables. RESULTS: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets were used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics. CONCLUSION: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. This approach yields multiple alternative models that are consistent with the data, yielding a constrained set of hypotheses that can be used to optimally design subsequent experiments.
format	Text
id	pubmed-1940030
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-19400302007-08-07 Accelerated search for biomolecular network models to interpret high-throughput experimental data Datta, Suman Sokhansanj, Bahrad A BMC Bioinformatics Methodology Article BACKGROUND: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables. RESULTS: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets were used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics. 
CONCLUSION: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. This approach yields multiple alternative models that are consistent with the data, yielding a constrained set of hypotheses that can be used to optimally design subsequent experiments. BioMed Central 2007-07-18 /pmc/articles/PMC1940030/ /pubmed/17640351 http://dx.doi.org/10.1186/1471-2105-8-258 Text en Copyright © 2007 Datta and Sokhansanj; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Datta, Suman Sokhansanj, Bahrad A Accelerated search for biomolecular network models to interpret high-throughput experimental data
title	Accelerated search for biomolecular network models to interpret high-throughput experimental data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1940030/ https://www.ncbi.nlm.nih.gov/pubmed/17640351 http://dx.doi.org/10.1186/1471-2105-8-258

