A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments

The world’s coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequ...

Bibliographic Details
Main authors: Buscombe, Daniel; Wernette, Phillipe; Fitzpatrick, Sharon; Favela, Jaycee; Goldstein, Evan B.; Enwright, Nicholas M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9860036/
https://www.ncbi.nlm.nih.gov/pubmed/36670109
http://dx.doi.org/10.1038/s41597-023-01929-2
author Buscombe, Daniel
Wernette, Phillipe
Fitzpatrick, Sharon
Favela, Jaycee
Goldstein, Evan B.
Enwright, Nicholas M.
author_sort Buscombe, Daniel
collection PubMed
description The world’s coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine Learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that are diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual and collections of images.
format Online
Article
Text
id pubmed-9860036
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9860036 2023-01-22
Sci Data
Data Descriptor
Nature Publishing Group UK 2023-01-20
/pmc/articles/PMC9860036/ /pubmed/36670109 http://dx.doi.org/10.1038/s41597-023-01929-2
Text en
© This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2023
https://creativecommons.org/licenses/by/4.0/
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments
topic Data Descriptor