
DWARF – a data warehouse system for analyzing protein families

BACKGROUND: The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data, and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformati...


Bibliographic Details
Main Authors:	Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Database
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1647292/ https://www.ncbi.nlm.nih.gov/pubmed/17094801 http://dx.doi.org/10.1186/1471-2105-7-495

author	Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen
collection	PubMed
description	BACKGROUND: The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data, and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. DESCRIPTION: The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification into homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming, and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed through an interface for searching and browsing, and through analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering Database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. CONCLUSION: DWARF has been designed for constructing databases of large, structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure, and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering.
format	Text
id	pubmed-1647292
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1647292 2006-11-18 DWARF – a data warehouse system for analyzing protein families Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen BMC Bioinformatics Database
BioMed Central 2006-11-09 /pmc/articles/PMC1647292/ /pubmed/17094801 http://dx.doi.org/10.1186/1471-2105-7-495 Text en Copyright © 2006 Fischer et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	DWARF – a data warehouse system for analyzing protein families
topic	Database
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1647292/ https://www.ncbi.nlm.nih.gov/pubmed/17094801 http://dx.doi.org/10.1186/1471-2105-7-495
