
2D geometric shapes dataset – for machine learning and pattern recognition

This data article describes a dataset of nine 2D geometric shapes, each drawn on a 200 × 200 RGB image. During generation, the perimeter and the position of each shape are selected randomly and independently for each image. The ro...


Bibliographic Details
Main authors: Korchi, Anas El, Ghanou, Youssef
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7397396/
https://www.ncbi.nlm.nih.gov/pubmed/32775584
http://dx.doi.org/10.1016/j.dib.2020.106090
_version_ 1783565770756718592
author Korchi, Anas El
Ghanou, Youssef
author_facet Korchi, Anas El
Ghanou, Youssef
author_sort Korchi, Anas El
collection PubMed
description This data article describes a dataset of nine 2D geometric shapes, each drawn on a 200 × 200 RGB image. During generation, the perimeter and the position of each shape are selected randomly and independently for each image. The rotation angle of each shape is chosen randomly for each image from the interval -180° to 180°, and the background colour of each image and the fill colour of each shape are likewise selected randomly and independently. The published dataset comprises 9 classes, each representing one type of geometric shape (Triangle, Square, Pentagon, Hexagon, Heptagon, Octagon, Nonagon, Circle and Star). Each class contains 10k generated images. The paper also includes a GitHub URL for the generator source code, which can be reused to generate a dataset of any desired size. The dataset is intended to be perfectly clean and suitable for both classification and clustering. Because it is generated synthetically, it can be used to study the behaviour of machine learning models independently of the nature of the data and of the noise or data leakage that may be present in other datasets. Moreover, the choice of 2D geometric shapes makes it possible to understand, and have good knowledge of, the number of patterns stored in each data class.
format Online
Article
Text
id pubmed-7397396
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7397396 2020-08-06 2D geometric shapes dataset – for machine learning and pattern recognition Korchi, Anas El Ghanou, Youssef Data Brief Computer Science Elsevier 2020-07-25 /pmc/articles/PMC7397396/ /pubmed/32775584 http://dx.doi.org/10.1016/j.dib.2020.106090 Text en © 2020 The Author(s) This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Computer Science
Korchi, Anas El
Ghanou, Youssef
2D geometric shapes dataset – for machine learning and pattern recognition
title 2D geometric shapes dataset – for machine learning and pattern recognition
title_full 2D geometric shapes dataset – for machine learning and pattern recognition
title_fullStr 2D geometric shapes dataset – for machine learning and pattern recognition
title_full_unstemmed 2D geometric shapes dataset – for machine learning and pattern recognition
title_short 2D geometric shapes dataset – for machine learning and pattern recognition
title_sort 2d geometric shapes dataset – for machine learning and pattern recognition
topic Computer Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7397396/
https://www.ncbi.nlm.nih.gov/pubmed/32775584
http://dx.doi.org/10.1016/j.dib.2020.106090
work_keys_str_mv AT korchianasel 2dgeometricshapesdatasetformachinelearningandpatternrecognition
AT ghanouyoussef 2dgeometricshapesdatasetformachinelearningandpatternrecognition
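
The description field above summarizes how each image is generated: a 200 × 200 canvas, with a random perimeter, position, rotation angle in [-180°, 180°], background colour and fill colour. The following is a minimal sketch of that sampling step for the seven regular polygons only. It is not the authors' generator (that is the code behind the GitHub URL mentioned in the article). The function names, the perimeter bounds, and the rule that keeps the whole shape inside the image are all assumptions made for illustration, and Circle and Star are left out.

```python
import math
import random

IMAGE_SIZE = 200  # images are 200 x 200 pixels, per the description

# Side counts for the seven regular polygons named in the description.
POLYGON_SIDES = {
    "Triangle": 3, "Square": 4, "Pentagon": 5, "Hexagon": 6,
    "Heptagon": 7, "Octagon": 8, "Nonagon": 9,
}


def circumradius(n_sides, perimeter):
    """Circumradius of a regular polygon with n_sides and the given perimeter."""
    side = perimeter / n_sides
    return side / (2.0 * math.sin(math.pi / n_sides))


def random_rgb(rng):
    """Draw one RGB colour uniformly at random."""
    return tuple(rng.randrange(256) for _ in range(3))


def sample_polygon(name, rng, min_perimeter=60.0, max_perimeter=300.0):
    """Sample the parameters of one regular polygon (hypothetical helper).

    The perimeter bounds are assumed values, not taken from the article.
    With these bounds the circumradius stays below half the image size,
    so a centre that keeps the shape inside the image always exists.
    """
    n = POLYGON_SIDES[name]
    perimeter = rng.uniform(min_perimeter, max_perimeter)
    radius = circumradius(n, perimeter)
    # Assumption: the whole shape must fit inside the image.
    cx = rng.uniform(radius, IMAGE_SIZE - radius)
    cy = rng.uniform(radius, IMAGE_SIZE - radius)
    angle_deg = rng.uniform(-180.0, 180.0)  # interval stated in the description
    theta0 = math.radians(angle_deg)
    vertices = [
        (cx + radius * math.cos(theta0 + 2.0 * math.pi * k / n),
         cy + radius * math.sin(theta0 + 2.0 * math.pi * k / n))
        for k in range(n)
    ]
    return {
        "shape": name,
        "perimeter": perimeter,
        "angle_deg": angle_deg,
        "vertices": vertices,
        "background": random_rgb(rng),  # colours are drawn independently
        "fill": random_rgb(rng),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    params = sample_polygon("Hexagon", rng)
    print(params["shape"], len(params["vertices"]), "vertices")
```

The returned vertex list could be passed to any rasterizer that fills a polygon, for example an image-drawing library, to produce the final RGB image.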