
SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms

BACKGROUND: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data set...


Bibliographic Details
Main authors: Van den Bulcke, Tim; Van Leemput, Koenraad; Naudts, Bart; van Remortel, Piet; Ma, Hongwu; Verschoren, Alain; De Moor, Bart; Marchal, Kathleen
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1373604/
https://www.ncbi.nlm.nih.gov/pubmed/16438721
http://dx.doi.org/10.1186/1471-2105-7-43
author Van den Bulcke, Tim
Van Leemput, Koenraad
Naudts, Bart
van Remortel, Piet
Ma, Hongwu
Verschoren, Alain
De Moor, Bart
Marchal, Kathleen
author_sort Van den Bulcke, Tim
collection PubMed
description BACKGROUND: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner. RESULTS: In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximates experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms. CONCLUSION: This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.
format Text
id pubmed-1373604
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1373604 2006-02-18 BMC Bioinformatics Methodology Article
BioMed Central 2006-01-26 /pmc/articles/PMC1373604/ /pubmed/16438721 http://dx.doi.org/10.1186/1471-2105-7-43 Text en Copyright © 2006 Van den Bulcke et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms
topic Methodology Article