A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor
An auto-generated thermoelectric-materials database is presented, containing 22,805 data records, automatically generated from the scientific literature, spanning 10,641 unique extracted chemical names. Each record contains a chemical entity and one of the seminal thermoelectric properties: thermoel...
Main Authors: | Sierepeklis, Odysseas; Cole, Jacqueline M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | Data Descriptor |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9587980/ https://www.ncbi.nlm.nih.gov/pubmed/36272983 http://dx.doi.org/10.1038/s41597-022-01752-1 |
_version_ | 1784814024768094208 |
---|---|
author | Sierepeklis, Odysseas Cole, Jacqueline M. |
author_facet | Sierepeklis, Odysseas Cole, Jacqueline M. |
author_sort | Sierepeklis, Odysseas |
collection | PubMed |
description | An auto-generated thermoelectric-materials database is presented, containing 22,805 data records, automatically generated from the scientific literature, spanning 10,641 unique extracted chemical names. Each record contains a chemical entity and one of the seminal thermoelectric properties: thermoelectric figure of merit, ZT; thermal conductivity, κ; Seebeck coefficient, S; electrical conductivity, σ; power factor, PF; each linked to their corresponding recorded temperature, T. The database was auto-generated using the automatic sentence-parsing capabilities of the chemistry-aware, natural language processing toolkit, ChemDataExtractor 2.0, adapted for application in the thermoelectric-materials domain, following a rule-based sentence-simplification step. Data were mined from the text of 60,843 scientific papers that were sourced from three scientific publishers: Elsevier, the Royal Society of Chemistry, and Springer. To the best of our knowledge, this is the first automatically-generated database of thermoelectric materials and their properties from existing literature. The database was evaluated to have a precision of 82.25% and has been made publicly available to facilitate the application of data science in the thermoelectric-materials domain, for analysis, design, and prediction. |
format | Online Article Text |
id | pubmed-9587980 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9587980 2022-10-24 A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor Sierepeklis, Odysseas Cole, Jacqueline M. Sci Data Data Descriptor Nature Publishing Group UK 2022-10-22 /pmc/articles/PMC9587980/ /pubmed/36272983 http://dx.doi.org/10.1038/s41597-022-01752-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | A thermoelectric materials database auto-generated from the scientific literature using ChemDataExtractor |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9587980/ https://www.ncbi.nlm.nih.gov/pubmed/36272983 http://dx.doi.org/10.1038/s41597-022-01752-1 |