
Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning

Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, being difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine if a relationship exists...


Bibliographic Details
Main Authors: Agrawal, Ayush, Petersen, Mark R.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8707206/
https://www.ncbi.nlm.nih.gov/pubmed/34941767
http://dx.doi.org/10.3390/toxics9120333
_version_ 1784622380335759360
author Agrawal, Ayush
Petersen, Mark R.
collection PubMed
description Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, being difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine if a relationship exists between soil’s hyperspectral data and arsenic concentration using NASA’s Hyperion satellite. It is the first arsenic study to use satellite-based hyperspectral data and apply a classification approach. Four regression machine learning models are tested to determine this correlation in soil with bare land cover. Raw data are converted to reflectance, problematic atmospheric influences are removed, characteristic wavelengths are selected, and four noise reduction algorithms are tested. The combination of data augmentation, Genetic Algorithm, Second Derivative Transformation, and Random Forest regression ([Formula: see text] and normalized root mean squared error (re-scaled to [0,1]) = [Formula: see text]) shows strong correlation, performing better than past models despite using noisier satellite data (versus lab-processed samples). Three binary classification machine learning models are then applied to identify high-risk shrub-covered regions in ten U.S. states, achieving strong accuracy (=0.693) and F1-score (=0.728). Overall, these results suggest that such a methodology is practical and can provide a sustainable alternative to arsenic contamination detection.
format Online
Article
Text
id pubmed-8707206
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8707206 2021-12-25 Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning Agrawal, Ayush Petersen, Mark R. Toxics Article MDPI 2021-12-03 /pmc/articles/PMC8707206/ /pubmed/34941767 http://dx.doi.org/10.3390/toxics9120333 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning
topic Article