The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age

Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental ac...

Bibliographic Details
Main authors: Adams, Sam; de Castro, Pablo; Echenique, Pablo; Estrada, Jorge; Hanwell, Marcus D; Murray-Rust, Peter; Sherwood, Paul; Thomas, Jens; Townsend, Joe
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206452/
https://www.ncbi.nlm.nih.gov/pubmed/21999363
http://dx.doi.org/10.1186/1758-2946-3-38
author Adams, Sam
de Castro, Pablo
Echenique, Pablo
Estrada, Jorge
Hanwell, Marcus D
Murray-Rust, Peter
Sherwood, Paul
Thomas, Jens
Townsend, Joe
author_facet Adams, Sam
de Castro, Pablo
Echenique, Pablo
Estrada, Jorge
Hanwell, Marcus D
Murray-Rust, Peter
Sherwood, Paul
Thomas, Jens
Townsend, Joe
author_sort Adams, Sam
collection PubMed
description Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium-sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography or bioinformatics, where standard formats and well-known, unified databases exist, this QC data is generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication. In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today. Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community.
As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education. The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well-defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding value at each stage. Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science.
format Online
Article
Text
id pubmed-3206452
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3206452 2011-11-03 The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age Adams, Sam de Castro, Pablo Echenique, Pablo Estrada, Jorge Hanwell, Marcus D Murray-Rust, Peter Sherwood, Paul Thomas, Jens Townsend, Joe J Cheminform Research Article Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium-sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography or bioinformatics, where standard formats and well-known, unified databases exist, this QC data is generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication. In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today.
Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community. As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education. The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well-defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding value at each stage. Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science. BioMed Central 2011-10-14 /pmc/articles/PMC3206452/ /pubmed/21999363 http://dx.doi.org/10.1186/1758-2946-3-38 Text en Copyright ©2011 Adams et al; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Adams, Sam
de Castro, Pablo
Echenique, Pablo
Estrada, Jorge
Hanwell, Marcus D
Murray-Rust, Peter
Sherwood, Paul
Thomas, Jens
Townsend, Joe
The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title_full The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title_fullStr The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title_full_unstemmed The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title_short The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age
title_sort quixote project: collaborative and open quantum chemistry data management in the internet age
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206452/
https://www.ncbi.nlm.nih.gov/pubmed/21999363
http://dx.doi.org/10.1186/1758-2946-3-38