Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction

Bibliographic Details
Main authors: Court, Callum J., Cole, Jacqueline M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007086/
https://www.ncbi.nlm.nih.gov/pubmed/29917013
http://dx.doi.org/10.1038/sdata.2018.111
_version_ 1783332969921904640
author Court, Callum J.
Cole, Jacqueline M.
collection PubMed
description Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Therein, records processed with the text-mining toolkit, ChemDataExtractor, were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm affords 82% precision. Database records are available in MongoDB, CSV and JSON formats, which can easily be read using Python, R, Java and MATLAB. This makes the database easy to query for tackling big-data materials science initiatives and provides a basis for magnetic materials discovery.
format Online
Article
Text
id pubmed-6007086
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-60070862018-06-27 Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction Court, Callum J. Cole, Jacqueline M. Sci Data Data Descriptor
Nature Publishing Group 2018-06-19 /pmc/articles/PMC6007086/ /pubmed/29917013 http://dx.doi.org/10.1038/sdata.2018.111 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.
title Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction
topic Data Descriptor