Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning
Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, being difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine if a relationship exists...
Main authors: | Agrawal, Ayush; Petersen, Mark R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8707206/ https://www.ncbi.nlm.nih.gov/pubmed/34941767 http://dx.doi.org/10.3390/toxics9120333 |
_version_ | 1784622380335759360 |
---|---|
author | Agrawal, Ayush; Petersen, Mark R. |
collection | PubMed |
description | Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, being difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine if a relationship exists between soil’s hyperspectral data and arsenic concentration using NASA’s Hyperion satellite. It is the first arsenic study to use satellite-based hyperspectral data and apply a classification approach. Four regression machine learning models are tested to determine this correlation in soil with bare land cover. Raw data are converted to reflectance, problematic atmospheric influences are removed, characteristic wavelengths are selected, and four noise reduction algorithms are tested. The combination of data augmentation, Genetic Algorithm, Second Derivative Transformation, and Random Forest regression ([Formula: see text] and normalized root mean squared error (re-scaled to [0,1]) = [Formula: see text]) shows strong correlation, performing better than past models despite using noisier satellite data (versus lab-processed samples). Three binary classification machine learning models are then applied to identify high-risk shrub-covered regions in ten U.S. states, achieving strong accuracy (=0.693) and F1-score (=0.728). Overall, these results suggest that such a methodology is practical and can provide a sustainable alternative to arsenic contamination detection. |
format | Online Article Text |
id | pubmed-8707206 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8707206 2021-12-25 Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning Agrawal, Ayush; Petersen, Mark R. Toxics Article MDPI 2021-12-03 /pmc/articles/PMC8707206/ /pubmed/34941767 http://dx.doi.org/10.3390/toxics9120333 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8707206/ https://www.ncbi.nlm.nih.gov/pubmed/34941767 http://dx.doi.org/10.3390/toxics9120333 |
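
The description in this record names a Second Derivative Transformation and reports a normalized root mean squared error, an accuracy, and an F1-score. The following is a minimal Python sketch of how such quantities are commonly computed, not the authors' implementation. The function names are hypothetical, the second derivative assumes evenly spaced spectral bands, and the NRMSE is normalized by the observed range, which is one common convention; the record does not say which normalization the study used.

```python
import math


def second_derivative(spectrum):
    """Central finite-difference second derivative of a reflectance spectrum.

    Assumption: bands are evenly spaced (unit spacing), so the result has
    two fewer points than the input.
    """
    return [
        spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values (assumed convention)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    span = max(y_true) - min(y_true)
    if span == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / span


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score for binary labels, where 1 marks the high-risk class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

For instance, `accuracy_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])` counts two true positives, one true negative, one false positive, and one false negative, giving an accuracy of 0.6 and an F1-score of about 0.667.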