Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing

BACKGROUND: Few environments have been developed or deployed to widely share biomolecular simulation data or to enable collaborative networks to facilitate data exploration and reuse. As the amount and complexity of data generated by these simulations is dramatically increasing and the methods are b...

Bibliographic Details
Main Authors:	Thibault, Julien C, Roe, Daniel R, Facelli, Julio C, Cheatham, Thomas E
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3915074/ https://www.ncbi.nlm.nih.gov/pubmed/24484917 http://dx.doi.org/10.1186/1758-2946-6-4

_version_	1782302518721642496
author	Thibault, Julien C Roe, Daniel R Facelli, Julio C Cheatham, Thomas E
author_facet	Thibault, Julien C Roe, Daniel R Facelli, Julio C Cheatham, Thomas E
author_sort	Thibault, Julien C
collection	PubMed
description	BACKGROUND: Few environments have been developed or deployed to widely share biomolecular simulation data or to enable collaborative networks to facilitate data exploration and reuse. As the amount and complexity of data generated by these simulations is dramatically increasing and the methods are being more widely applied, the need for new tools to manage and share this data has become obvious. In this paper we present the results of a process aimed at assessing the needs of the community for data representation standards to guide the implementation of future repositories for biomolecular simulations. RESULTS: We introduce a list of common data elements, inspired by previous work, and updated according to feedback from the community collected through a survey and personal interviews. These data elements integrate the concepts for multiple types of computational methods, including quantum chemistry and molecular dynamics. The identified core data elements were organized into a logical model to guide the design of new databases and application programming interfaces. Finally a set of dictionaries was implemented to be used via SQL queries or locally via a Java API built upon the Apache Lucene text-search engine. CONCLUSIONS: The model and its associated dictionaries provide a simple yet rich representation of the concepts related to biomolecular simulations, which should guide future developments of repositories and more complex terminologies and ontologies. The model still remains extensible through the decomposition of virtual experiments into tasks and parameter sets, and via the use of extended attributes. The benefits of a common logical model for biomolecular simulations were illustrated through various use cases, including data storage, indexing, and presentation. All the models and dictionaries introduced in this paper are available for download at http://ibiomes.chpc.utah.edu/mediawiki/index.php/Downloads.
format	Online Article Text
id	pubmed-3915074
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-39150742014-02-07 Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing Thibault, Julien C Roe, Daniel R Facelli, Julio C Cheatham, Thomas E J Cheminform Methodology BACKGROUND: Few environments have been developed or deployed to widely share biomolecular simulation data or to enable collaborative networks to facilitate data exploration and reuse. As the amount and complexity of data generated by these simulations is dramatically increasing and the methods are being more widely applied, the need for new tools to manage and share this data has become obvious. In this paper we present the results of a process aimed at assessing the needs of the community for data representation standards to guide the implementation of future repositories for biomolecular simulations. RESULTS: We introduce a list of common data elements, inspired by previous work, and updated according to feedback from the community collected through a survey and personal interviews. These data elements integrate the concepts for multiple types of computational methods, including quantum chemistry and molecular dynamics. The identified core data elements were organized into a logical model to guide the design of new databases and application programming interfaces. Finally a set of dictionaries was implemented to be used via SQL queries or locally via a Java API built upon the Apache Lucene text-search engine. CONCLUSIONS: The model and its associated dictionaries provide a simple yet rich representation of the concepts related to biomolecular simulations, which should guide future developments of repositories and more complex terminologies and ontologies. The model still remains extensible through the decomposition of virtual experiments into tasks and parameter sets, and via the use of extended attributes. 
The benefits of a common logical model for biomolecular simulations were illustrated through various use cases, including data storage, indexing, and presentation. All the models and dictionaries introduced in this paper are available for download at http://ibiomes.chpc.utah.edu/mediawiki/index.php/Downloads. BioMed Central 2014-01-30 /pmc/articles/PMC3915074/ /pubmed/24484917 http://dx.doi.org/10.1186/1758-2946-6-4 Text en Copyright © 2014 Thibault et al.; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Thibault, Julien C Roe, Daniel R Facelli, Julio C Cheatham, Thomas E Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing
title	Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3915074/ https://www.ncbi.nlm.nih.gov/pubmed/24484917 http://dx.doi.org/10.1186/1758-2946-6-4
