A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences

Thermobarometry is a fundamental tool to quantitatively interrogate magma plumbing systems and broaden our appreciation of volcanic processes. Developments in random forest‐based machine learning lend themselves to a data‐driven approach to clinopyroxene thermobarometry, allowing users to access lar...

Bibliographic Details
Main Authors: Jorgenson, C., Higgins, O., Petrelli, M., Bégué, F., Caricchi, L.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9285709/
https://www.ncbi.nlm.nih.gov/pubmed/35860374
http://dx.doi.org/10.1029/2021JB022904
_version_ 1784747842231861248
author Jorgenson, C.
Higgins, O.
Petrelli, M.
Bégué, F.
Caricchi, L.
author_facet Jorgenson, C.
Higgins, O.
Petrelli, M.
Bégué, F.
Caricchi, L.
author_sort Jorgenson, C.
collection PubMed
description Thermobarometry is a fundamental tool to quantitatively interrogate magma plumbing systems and broaden our appreciation of volcanic processes. Developments in random forest‐based machine learning lend themselves to a data‐driven approach to clinopyroxene thermobarometry, allowing users to access large experimental data sets that can be tailored to individual applications in Earth Sciences. We present a methodological assessment of random forest thermobarometry using the R freeware package extraTrees. We investigate the model performance, the effect of hyperparameter tuning, and assess different methods for calculating uncertainties. Deviating from the default hyperparameters used in the extraTrees package results in little difference in overall model performance (<0.2 kbar and <3°C difference in standard error estimate, SEE). However, accuracy is greatly affected by how the final value from the distribution of trees in the random forest is selected (mean, median, or mode). Using the mean value leads to higher residuals between experimental and predicted P and T, whereas using median values produces smaller residuals. Additionally, this work provides two scripts for users to apply the methodology to natural data sets. The first script permits modification and filtering of the model calibration data set. The second script contains premade models, where users can rapidly input their data to recover PT estimates (SEE clinopyroxene‐only model: 3.2 kbar, 72.5°C and liquid‐clinopyroxene model: 2.7 kbar, 44.9°C). Additionally, the scripts allow the user to estimate the uncertainty for each analysis, which in some cases is significantly smaller than the reported SEE. These scripts are open source and can be accessed at https://github.com/corinjorgenson/RandomForest-cpx-thermobarometer.
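The abstract's point about how the final value is selected from the distribution of trees can be illustrated with a minimal sketch. This is not the authors' script (their work uses the R package extraTrees); it is a standard-library Python illustration, and the function name `aggregate_tree_predictions`, the `bin_width` parameter, and the vote values are all hypothetical.

```python
import math
import statistics


def aggregate_tree_predictions(tree_predictions, bin_width=1.0):
    """Collapse per-tree predictions into a mean, a median, and a binned mode.

    Hypothetical helper for illustration only; it is not part of extraTrees
    or of the published scripts.
    """
    mean_value = statistics.fmean(tree_predictions)
    median_value = statistics.median(tree_predictions)
    # A mode of a continuous quantity needs binning: count votes per bin
    # and report the centre of the most populated bin.
    bins = [math.floor(p / bin_width) for p in tree_predictions]
    mode_value = (statistics.mode(bins) + 0.5) * bin_width
    return {"mean": mean_value, "median": median_value, "mode": mode_value}


# Invented pressures (kbar) voted by individual trees for a single analysis.
# Most trees agree near 2 kbar; two trees vote far higher.
votes = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 9.0, 10.0]
result = aggregate_tree_predictions(votes, bin_width=1.0)
print(result)
```

With these invented votes, the two outlying trees pull the mean well above the cluster where most trees agree, while the median stays inside it. That is one plausible reading of why the choice of summary statistic matters for skewed vote distributions; the sketch does not reproduce the residuals reported in the abstract.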
format Online
Article
Text
id pubmed-9285709
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-92857092022-07-18 A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences Jorgenson, C. Higgins, O. Petrelli, M. Bégué, F. Caricchi, L. J Geophys Res Solid Earth Research Article Thermobarometry is a fundamental tool to quantitatively interrogate magma plumbing systems and broaden our appreciation of volcanic processes. Developments in random forest‐based machine learning lend themselves to a data‐driven approach to clinopyroxene thermobarometry, allowing users to access large experimental data sets that can be tailored to individual applications in Earth Sciences. We present a methodological assessment of random forest thermobarometry using the R freeware package extraTrees. We investigate the model performance, the effect of hyperparameter tuning, and assess different methods for calculating uncertainties. Deviating from the default hyperparameters used in the extraTrees package results in little difference in overall model performance (<0.2 kbar and <3°C difference in standard error estimate, SEE). However, accuracy is greatly affected by how the final value from the distribution of trees in the random forest is selected (mean, median, or mode). Using the mean value leads to higher residuals between experimental and predicted P and T, whereas using median values produces smaller residuals. Additionally, this work provides two scripts for users to apply the methodology to natural data sets. The first script permits modification and filtering of the model calibration data set. The second script contains premade models, where users can rapidly input their data to recover PT estimates (SEE clinopyroxene‐only model: 3.2 kbar, 72.5°C and liquid‐clinopyroxene model: 2.7 kbar, 44.9°C). Additionally, the scripts allow the user to estimate the uncertainty for each analysis, which in some cases is significantly smaller than the reported SEE. 
These scripts are open source and can be accessed at https://github.com/corinjorgenson/RandomForest-cpx-thermobarometer. John Wiley and Sons Inc. 2022-04-09 2022-04 /pmc/articles/PMC9285709/ /pubmed/35860374 http://dx.doi.org/10.1029/2021JB022904 Text en © 2022. The Authors. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Jorgenson, C.
Higgins, O.
Petrelli, M.
Bégué, F.
Caricchi, L.
A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title_full A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title_fullStr A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title_full_unstemmed A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title_short A Machine Learning‐Based Approach to Clinopyroxene Thermobarometry: Model Optimization and Distribution for Use in Earth Sciences
title_sort machine learning‐based approach to clinopyroxene thermobarometry: model optimization and distribution for use in earth sciences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9285709/
https://www.ncbi.nlm.nih.gov/pubmed/35860374
http://dx.doi.org/10.1029/2021JB022904