The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions
Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 (“Human Against Machine with 10000 training images”) dataset. We collected...
Main authors: | Tschandl, Philipp; Rosendahl, Cliff; Kittler, Harald |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2018 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6091241/ https://www.ncbi.nlm.nih.gov/pubmed/30106392 http://dx.doi.org/10.1038/sdata.2018.161 |
_version_ | 1783347357015867392 |
---|---|
author | Tschandl, Philipp Rosendahl, Cliff Kittler, Harald |
author_facet | Tschandl, Philipp Rosendahl, Cliff Kittler, Harald |
author_sort | Tschandl, Philipp |
collection | PubMed |
description | Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 (“Human Against Machine with 10000 training images”) dataset. We collected dermatoscopic images from different populations acquired and stored by different modalities. Given this diversity we had to apply different acquisition and cleaning methods and developed semi-automatic workflows utilizing specifically trained neural networks. The final dataset consists of 10015 dermatoscopic images which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive. This benchmark dataset can be used for machine learning and for comparisons with human experts. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions. More than 50% of lesions have been confirmed by pathology, while the ground truth for the rest of the cases was either follow-up, expert consensus, or confirmation by in-vivo confocal microscopy. |
format | Online Article Text |
id | pubmed-6091241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-60912412018-08-24 The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions Tschandl, Philipp Rosendahl, Cliff Kittler, Harald Sci Data Data Descriptor Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 (“Human Against Machine with 10000 training images”) dataset. We collected dermatoscopic images from different populations acquired and stored by different modalities. Given this diversity we had to apply different acquisition and cleaning methods and developed semi-automatic workflows utilizing specifically trained neural networks. The final dataset consists of 10015 dermatoscopic images which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive. This benchmark dataset can be used for machine learning and for comparisons with human experts. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions. More than 50% of lesions have been confirmed by pathology, while the ground truth for the rest of the cases was either follow-up, expert consensus, or confirmation by in-vivo confocal microscopy. Nature Publishing Group 2018-08-14 /pmc/articles/PMC6091241/ /pubmed/30106392 http://dx.doi.org/10.1038/sdata.2018.161 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article. |
spellingShingle | Data Descriptor Tschandl, Philipp Rosendahl, Cliff Kittler, Harald The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title | The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title_full | The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title_fullStr | The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title_full_unstemmed | The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title_short | The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
title_sort | ham10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6091241/ https://www.ncbi.nlm.nih.gov/pubmed/30106392 http://dx.doi.org/10.1038/sdata.2018.161 |
work_keys_str_mv | AT tschandlphilipp theham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions AT rosendahlcliff theham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions AT kittlerharald theham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions AT tschandlphilipp ham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions AT rosendahlcliff ham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions AT kittlerharald ham10000datasetalargecollectionofmultisourcedermatoscopicimagesofcommonpigmentedskinlesions |