CloudSEN12, a global dataset for semantic understanding of cloud and cloud shadow in Sentinel-2

Bibliographic Details
Main Authors: Aybar, Cesar; Ysuhuaylas, Luis; Loja, Jhomira; Gonzales, Karen; Herrera, Fernando; Bautista, Lesly; Yali, Roy; Flores, Angie; Diaz, Lissette; Cuenca, Nicole; Espinoza, Wendy; Prudencio, Fernando; Llactayo, Valeria; Montero, David; Sudmanns, Martin; Tiede, Dirk; Mateo-García, Gonzalo; Gómez-Chova, Luis
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789947/
https://www.ncbi.nlm.nih.gov/pubmed/36566333
http://dx.doi.org/10.1038/s41597-022-01878-2
Description
Summary: Accurately characterizing clouds and their shadows is a long-standing problem in the Earth Observation community. Recent works showcase the necessity to improve cloud detection methods for imagery acquired by the Sentinel-2 satellites. However, the lack of consensus and transparency in existing reference datasets hampers the benchmarking of current cloud detection methods. Exploiting the analysis-ready data offered by the Copernicus program, we created CloudSEN12, a new multi-temporal global dataset to foster research in cloud and cloud shadow detection. CloudSEN12 has 49,400 image patches, including (1) Sentinel-2 level-1C and level-2A multi-spectral data, (2) Sentinel-1 synthetic aperture radar data, (3) auxiliary remote sensing products, (4) different hand-crafted annotations to label the presence of thick and thin clouds and cloud shadows, and (5) the results from eight state-of-the-art cloud detection algorithms. At present, CloudSEN12 exceeds all previous efforts in terms of annotation richness, scene variability, geographic distribution, metadata complexity, quality control, and number of samples.