A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments
The world’s coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequ...
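Main authors: | Buscombe, Daniel; Wernette, Phillipe; Fitzpatrick, Sharon; Favela, Jaycee; Goldstein, Evan B.; Enwright, Nicholas M. |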
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9860036/ https://www.ncbi.nlm.nih.gov/pubmed/36670109 http://dx.doi.org/10.1038/s41597-023-01929-2 |
_version_ | 1784874484036009984 |
---|---|
author | Buscombe, Daniel; Wernette, Phillipe; Fitzpatrick, Sharon; Favela, Jaycee; Goldstein, Evan B.; Enwright, Nicholas M. |
author_sort | Buscombe, Daniel |
collection | PubMed |
description | The world’s coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine Learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that are diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual and collections of images. |
format | Online Article Text |
id | pubmed-9860036 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9860036 2023-01-22 A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments Buscombe, Daniel Wernette, Phillipe Fitzpatrick, Sharon Favela, Jaycee Goldstein, Evan B. Enwright, Nicholas M. Sci Data Data Descriptor The world’s coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine Learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that are diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual and collections of images. Nature Publishing Group UK 2023-01-20 /pmc/articles/PMC9860036/ /pubmed/36670109 http://dx.doi.org/10.1038/s41597-023-01929-2 Text en © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Data Descriptor; Buscombe, Daniel; Wernette, Phillipe; Fitzpatrick, Sharon; Favela, Jaycee; Goldstein, Evan B.; Enwright, Nicholas M.; A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments |
title | A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments |
title_sort | 1.2 billion pixel human-labeled dataset for data-driven classification of coastal environments |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9860036/ https://www.ncbi.nlm.nih.gov/pubmed/36670109 http://dx.doi.org/10.1038/s41597-023-01929-2 |
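
The description above mentions quantifying pixel-level agreement between multiple labelers. The record does not say which agreement statistic the authors use, so the following is only a minimal sketch of one simple possibility: the fraction of pixels that two labelers assign to the same class. The function name `pixel_agreement`, the integer class codes, and the tiny 3×3 label images are all hypothetical and are not taken from the Coast Train dataset.

```python
from typing import Sequence

# A label image is assumed here to be a 2D grid of integer class codes.
LabelImage = Sequence[Sequence[int]]


def pixel_agreement(labels_a: LabelImage, labels_b: LabelImage) -> float:
    """Return the fraction of pixels given the same class by two labelers.

    This is plain percent agreement, chosen for illustration only; the
    source does not state which agreement metric was actually used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label images must have the same number of rows")

    total = 0
    matches = 0
    for row_a, row_b in zip(labels_a, labels_b):
        if len(row_a) != len(row_b):
            raise ValueError("label images must have the same row lengths")
        for class_a, class_b in zip(row_a, row_b):
            total += 1
            if class_a == class_b:
                matches += 1

    if total == 0:
        raise ValueError("label images must contain at least one pixel")
    return matches / total


# Hypothetical class codes (for example 0 = water, 1 = sand, 2 = vegetation).
labeler_1 = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
labeler_2 = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]

agreement = pixel_agreement(labeler_1, labeler_2)
print(f"pixel-level agreement: {agreement:.3f}")
```

In this toy case the two labelers disagree on two of the nine pixels. Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable when class frequencies are very unbalanced.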