
Ultra-Structure database design methodology for managing systems biology data and analyses

BACKGROUND: Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biologic...


Bibliographic Details
Main authors:	Maier, Christopher W; Long, Jeffrey G; Hemminger, Bradley M; Giddings, Morgan C
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2748085/ https://www.ncbi.nlm.nih.gov/pubmed/19691849 http://dx.doi.org/10.1186/1471-2105-10-254

author	Maier, Christopher W; Long, Jeffrey G; Hemminger, Bradley M; Giddings, Morgan C
collection	PubMed
description	BACKGROUND: Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogeneous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping).
RESULTS: We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research.
CONCLUSION: We find Ultra-Structure offers substantial benefits for biological information systems, the largest being the integration of diverse information sources into a common framework. This facilitates systems biology research by integrating data from disparate high-throughput techniques. It also enables us to readily incorporate new data types, sources, and domain knowledge with no change to the database structure or associated computer code. Ultra-Structure may be a significant step towards solving the hard problem of data management and integration in the systems biology era.
format	Text
id	pubmed-2748085
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2748085 2009-09-22 Ultra-Structure database design methodology for managing systems biology data and analyses. Maier, Christopher W; Long, Jeffrey G; Hemminger, Bradley M; Giddings, Morgan C. BMC Bioinformatics. Methodology Article. BioMed Central 2009-08-19 /pmc/articles/PMC2748085/ /pubmed/19691849 http://dx.doi.org/10.1186/1471-2105-10-254 Text en Copyright © 2009 Maier et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Ultra-Structure database design methodology for managing systems biology data and analyses
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2748085/ https://www.ncbi.nlm.nih.gov/pubmed/19691849 http://dx.doi.org/10.1186/1471-2105-10-254


Similar Items