Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction
Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatu...
Main Authors: | Court, Callum J.; Cole, Jacqueline M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007086/ https://www.ncbi.nlm.nih.gov/pubmed/29917013 http://dx.doi.org/10.1038/sdata.2018.111 |
_version_ | 1783332969921904640 |
---|---|
author | Court, Callum J. Cole, Jacqueline M. |
author_facet | Court, Callum J. Cole, Jacqueline M. |
author_sort | Court, Callum J. |
collection | PubMed |
description | Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Therein, records processed with the text-mining toolkit, ChemDataExtractor, were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm affords 82% precision. Database records are available in MongoDB, CSV and JSON formats, which can easily be read using Python, R, Java and MATLAB. This makes the database easy to query for tackling big-data materials science initiatives and provides a basis for magnetic materials discovery. |
format | Online Article Text |
id | pubmed-6007086 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-6007086 2018-06-27 Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction Court, Callum J. Cole, Jacqueline M. Sci Data Data Descriptor Nature Publishing Group 2018-06-19 /pmc/articles/PMC6007086/ /pubmed/29917013 http://dx.doi.org/10.1038/sdata.2018.111 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article. |
title | Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007086/ https://www.ncbi.nlm.nih.gov/pubmed/29917013 http://dx.doi.org/10.1038/sdata.2018.111 |
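
The description above notes that the database records are distributed in JSON (among other formats) and can be read with Python. A minimal sketch of what such a query might look like follows. The field names (`compound`, `type`, `value`, `units`) and the three sample records are assumptions made for illustration only; they are not taken from the published database, whose actual schema may differ.

```python
import json

# Hypothetical record layout: the field names below are illustrative
# assumptions, not the schema of the published database.
# The sample values are approximate textbook figures, not database records.
SAMPLE_JSON = """
[
  {"compound": "Fe",  "type": "Curie", "value": 1043.0, "units": "K"},
  {"compound": "Ni",  "type": "Curie", "value": 627.0,  "units": "K"},
  {"compound": "MnO", "type": "Neel",  "value": 118.0,  "units": "K"}
]
"""


def load_records(text):
    """Parse a JSON array of records into a list of dicts."""
    return json.loads(text)


def filter_records(records, transition, min_kelvin=None, max_kelvin=None):
    """Return records of one transition type within an optional kelvin range."""
    selected = []
    for rec in records:
        if rec["type"] != transition:
            continue
        if rec["units"] != "K":
            # Skip anything not reported in kelvin rather than guess a conversion.
            continue
        value = rec["value"]
        if min_kelvin is not None and value < min_kelvin:
            continue
        if max_kelvin is not None and value > max_kelvin:
            continue
        selected.append(rec)
    return selected


records = load_records(SAMPLE_JSON)
# Compounds whose (assumed) Curie temperature lies above roughly room temperature.
above_room_temp = filter_records(records, "Curie", min_kelvin=300.0)
print([rec["compound"] for rec in above_room_temp])
```

With the real files, `SAMPLE_JSON` would be replaced by the contents of the downloaded JSON export, and the field names adjusted to match it. Because the description reports an estimated overall precision of 73%, any records selected this way would still merit a manual check against the source articles.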