
QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules

We introduce QM7-X, a comprehensive dataset of 42 physicochemical properties for ≈4.2 million equilibrium and non-equilibrium structures of small organic molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this fundamentally important region of chemical compound space (CCS), QM7-...


Bibliographic Details
Main authors: Hoja, Johannes; Medrano Sandonas, Leonardo; Ernst, Brian G.; Vazquez-Mayagoitia, Alvaro; DiStasio Jr., Robert A.; Tkatchenko, Alexandre
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7854709/
https://www.ncbi.nlm.nih.gov/pubmed/33531509
http://dx.doi.org/10.1038/s41597-021-00812-2
author Hoja, Johannes
Medrano Sandonas, Leonardo
Ernst, Brian G.
Vazquez-Mayagoitia, Alvaro
DiStasio Jr., Robert A.
Tkatchenko, Alexandre
collection PubMed
description We introduce QM7-X, a comprehensive dataset of 42 physicochemical properties for ≈4.2 million equilibrium and non-equilibrium structures of small organic molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this fundamentally important region of chemical compound space (CCS), QM7-X includes an exhaustive sampling of (meta-)stable equilibrium structures—comprised of constitutional/structural isomers and stereoisomers, e.g., enantiomers and diastereomers (including cis-/trans- and conformational isomers)—as well as 100 non-equilibrium structural variations thereof to reach a total of ≈4.2 million molecular structures. Computed at the tightly converged quantum-mechanical PBE0+MBD level of theory, QM7-X contains global (molecular) and local (atom-in-a-molecule) properties ranging from ground state quantities (such as atomization energies and dipole moments) to response quantities (such as polarizability tensors and dispersion coefficients). By providing a systematic, extensive, and tightly-converged dataset of quantum-mechanically computed physicochemical properties, we expect that QM7-X will play a critical role in the development of next-generation machine-learning based models for exploring greater swaths of CCS and performing in silico design of molecules with targeted properties.
format Online
Article
Text
id pubmed-7854709
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7854709 2021-02-11 Sci Data Data Descriptor
Nature Publishing Group UK 2021-02-02 /pmc/articles/PMC7854709/ /pubmed/33531509 http://dx.doi.org/10.1038/s41597-021-00812-2 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
title QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules
topic Data Descriptor