Design and implementation of a generalized laboratory data model

BACKGROUND: Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the impo...

Bibliographic Details
Main authors: Wendl, Michael C; Smith, Scott; Pohl, Craig S; Dooling, David J; Chinwalla, Asif T; Crouse, Kevin; Hepler, Todd; Leong, Shin; Carmichael, Lynn; Nhan, Mike; Oberkfell, Benjamin J; Mardis, Elaine R; Hillier, LaDeana W; Wilson, Richard K
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2194795/
https://www.ncbi.nlm.nih.gov/pubmed/17897463
http://dx.doi.org/10.1186/1471-2105-8-362
author Wendl, Michael C
Smith, Scott
Pohl, Craig S
Dooling, David J
Chinwalla, Asif T
Crouse, Kevin
Hepler, Todd
Leong, Shin
Carmichael, Lynn
Nhan, Mike
Oberkfell, Benjamin J
Mardis, Elaine R
Hillier, LaDeana W
Wilson, Richard K
author_sort Wendl, Michael C
collection PubMed
description BACKGROUND: Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable. RESULTS: We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions. CONCLUSION: The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.
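The separation described in the abstract (regular entities and events kept in distinct schemas, joined by a many-to-many association, with a "processing directive" layered on top as a sequence of events) can be illustrated with a small in-memory sketch. This is not the authors' implementation, which is a database-backed system with its own programming interface; every name here (`Entity`, `Event`, `Directive`, `Lab`, `record_event`, `history`, `next_step`) is a hypothetical choice made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A regular entity, e.g. a plate or a sample (hypothetical fields)."""
    entity_id: int
    kind: str


@dataclass
class Event:
    """One occurrence of a processing step (hypothetical fields)."""
    event_id: int
    step: str


@dataclass
class Directive:
    """A processing directive: an ordered list of step names held as data."""
    name: str
    steps: list[str]


@dataclass
class Lab:
    entities: dict[int, Entity] = field(default_factory=dict)
    events: dict[int, Event] = field(default_factory=dict)
    # Many-to-many association: a set of (entity_id, event_id) pairs.
    links: set[tuple[int, int]] = field(default_factory=set)

    def record_event(self, step: str, entity_ids: list[int]) -> Event:
        """Create one event and link it to every entity it touched."""
        missing = [e for e in entity_ids if e not in self.entities]
        if missing:
            raise KeyError(f"unknown entity ids: {missing}")
        event = Event(len(self.events) + 1, step)
        self.events[event.event_id] = event
        for entity_id in entity_ids:
            self.links.add((entity_id, event.event_id))
        return event

    def history(self, entity_id: int) -> list[str]:
        """Step names of all events linked to an entity, in creation order."""
        ids = sorted(ev for ent, ev in self.links if ent == entity_id)
        return [self.events[i].step for i in ids]

    def next_step(self, directive: Directive, entity_id: int) -> str | None:
        """First step of the directive the entity has not yet been through."""
        done = set(self.history(entity_id))
        for step in directive.steps:
            if step not in done:
                return step
        return None


lab = Lab()
lab.entities[1] = Entity(1, "plate")
lab.entities[2] = Entity(2, "plate")
pipeline = Directive("example", ["pick", "prep", "sequence"])

lab.record_event("pick", [1, 2])  # one event, two entities
lab.record_event("prep", [1])     # entity 1 now has two events

print(lab.next_step(pipeline, 1))  # sequence
print(lab.next_step(pipeline, 2))  # prep
```

Because the directive is plain data, inserting a step (for example `Directive("example", ["pick", "qc", "prep", "sequence"])`) changes what `next_step` returns without touching the `Entity` or `Event` definitions, which mirrors the abstract's claim that directives can be modified without schema changes.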
format Text
id pubmed-2194795
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2194795 2008-01-13 Design and implementation of a generalized laboratory data model BMC Bioinformatics Methodology Article BioMed Central 2007-09-26 /pmc/articles/PMC2194795/ /pubmed/17897463 http://dx.doi.org/10.1186/1471-2105-8-362 Text en Copyright © 2007 Wendl et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Design and implementation of a generalized laboratory data model
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2194795/
https://www.ncbi.nlm.nih.gov/pubmed/17897463
http://dx.doi.org/10.1186/1471-2105-8-362