A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor

An auto-generated thermoelectric-materials database is presented, containing 22,805 data records, automatically generated from the scientific literature, spanning 10,641 unique extracted chemical names. Each record contains a chemical entity and one of the seminal thermoelectric properties: thermoel...

Bibliographic Details
Main Authors: Sierepeklis, Odysseas; Cole, Jacqueline M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9587980/
https://www.ncbi.nlm.nih.gov/pubmed/36272983
http://dx.doi.org/10.1038/s41597-022-01752-1
_version_ 1784814024768094208
author Sierepeklis, Odysseas
Cole, Jacqueline M.
author_facet Sierepeklis, Odysseas
Cole, Jacqueline M.
author_sort Sierepeklis, Odysseas
collection PubMed
description An auto-generated thermoelectric-materials database is presented, containing 22,805 data records, automatically generated from the scientific literature, spanning 10,641 unique extracted chemical names. Each record contains a chemical entity and one of the seminal thermoelectric properties: thermoelectric figure of merit, ZT; thermal conductivity, κ; Seebeck coefficient, S; electrical conductivity, σ; power factor, PF; each linked to their corresponding recorded temperature, T. The database was auto-generated using the automatic sentence-parsing capabilities of the chemistry-aware, natural language processing toolkit, ChemDataExtractor 2.0, adapted for application in the thermoelectric-materials domain, following a rule-based sentence-simplification step. Data were mined from the text of 60,843 scientific papers that were sourced from three scientific publishers: Elsevier, the Royal Society of Chemistry, and Springer. To the best of our knowledge, this is the first automatically-generated database of thermoelectric materials and their properties from existing literature. The database was evaluated to have a precision of 82.25% and has been made publicly available to facilitate the application of data science in the thermoelectric-materials domain, for analysis, design, and prediction.
format Online
Article
Text
id pubmed-9587980
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-95879802022-10-24 A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor Sierepeklis, Odysseas Cole, Jacqueline M. Sci Data Data Descriptor An auto-generated thermoelectric-materials database is presented, containing 22,805 data records, automatically generated from the scientific literature, spanning 10,641 unique extracted chemical names. Each record contains a chemical entity and one of the seminal thermoelectric properties: thermoelectric figure of merit, ZT; thermal conductivity, κ; Seebeck coefficient, S; electrical conductivity, σ; power factor, PF; each linked to their corresponding recorded temperature, T. The database was auto-generated using the automatic sentence-parsing capabilities of the chemistry-aware, natural language processing toolkit, ChemDataExtractor 2.0, adapted for application in the thermoelectric-materials domain, following a rule-based sentence-simplification step. Data were mined from the text of 60,843 scientific papers that were sourced from three scientific publishers: Elsevier, the Royal Society of Chemistry, and Springer. To the best of our knowledge, this is the first automatically-generated database of thermoelectric materials and their properties from existing literature. The database was evaluated to have a precision of 82.25% and has been made publicly available to facilitate the application of data science in the thermoelectric-materials domain, for analysis, design, and prediction. 
Nature Publishing Group UK 2022-10-22 /pmc/articles/PMC9587980/ /pubmed/36272983 http://dx.doi.org/10.1038/s41597-022-01752-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Data Descriptor
Sierepeklis, Odysseas
Cole, Jacqueline M.
A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title_full A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title_fullStr A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title_full_unstemmed A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title_short A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
title_sort thermoelectric materials database auto-generated from the scientific literature using chemdataextractor
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9587980/
https://www.ncbi.nlm.nih.gov/pubmed/36272983
http://dx.doi.org/10.1038/s41597-022-01752-1
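
The description in this record lists five properties (ZT, κ, S, σ, PF), each tied to a chemical name and a temperature. As a rough illustration of how such records relate to one another, the sketch below uses the standard textbook definitions PF = S²σ and ZT = S²σT/κ. The class name, field names, and numeric values are hypothetical and chosen only for this example; they are not taken from the published database or from ChemDataExtractor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermoelectricRecord:
    """Hypothetical shape of one record: a compound, one property, one temperature."""
    compound: str          # extracted chemical name
    prop: str              # one of "ZT", "kappa", "S", "sigma", "PF"
    value: float           # property value in SI units
    temperature_k: float   # temperature the value was reported at, in kelvin


def power_factor(seebeck_v_per_k: float, sigma_s_per_m: float) -> float:
    """PF = S^2 * sigma, in W m^-1 K^-2."""
    return seebeck_v_per_k ** 2 * sigma_s_per_m


def figure_of_merit(seebeck_v_per_k: float, sigma_s_per_m: float,
                    kappa_w_per_mk: float, temperature_k: float) -> float:
    """ZT = S^2 * sigma * T / kappa (dimensionless)."""
    if kappa_w_per_mk <= 0:
        raise ValueError("thermal conductivity must be positive")
    return power_factor(seebeck_v_per_k, sigma_s_per_m) * temperature_k / kappa_w_per_mk


# Illustrative values only (not from the database): S = 200 uV/K,
# sigma = 1e5 S/m, kappa = 1.5 W/(m K), T = 300 K.
example_zt = figure_of_merit(200e-6, 1e5, 1.5, 300.0)
example_record = ThermoelectricRecord("ExampleCompound", "ZT", example_zt, 300.0)
```

Because ZT combines three separately measured quantities, values should only be combined this way when they were reported for the same material at the same temperature.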