The OpenScience Slovenia metadata dataset

The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of...

Bibliographic Details
Main authors: Borovič, Mladen; Ferme, Marko; Brezovnik, Janez; Majninger, Sandi; Bregant, Albin; Hrovat, Goran; Ojsteršek, Milan
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects: Computer Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6928342/
https://www.ncbi.nlm.nih.gov/pubmed/31890793
http://dx.doi.org/10.1016/j.dib.2019.104942
author Borovič, Mladen
Ferme, Marko
Brezovnik, Janez
Majninger, Sandi
Bregant, Albin
Hrovat, Goran
Ojsteršek, Milan
collection PubMed
description The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of the establishment of the Slovenian Open-Access Infrastructure which defined a unified document collection process and cataloguing for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields, representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The potential of this dataset lies especially in text mining and text classification tasks and can also be used in development or benchmarking of content-based recommender systems on real-world data.
format Online
Article
Text
id pubmed-6928342
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6928342
2019-12-30
The OpenScience Slovenia metadata dataset
Borovič, Mladen; Ferme, Marko; Brezovnik, Janez; Majninger, Sandi; Bregant, Albin; Hrovat, Goran; Ojsteršek, Milan
Data Brief
Computer Science
Elsevier
2019-12-05
/pmc/articles/PMC6928342/
/pubmed/31890793
http://dx.doi.org/10.1016/j.dib.2019.104942
Text
en
© 2019 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title The OpenScience Slovenia metadata dataset
topic Computer Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6928342/
https://www.ncbi.nlm.nih.gov/pubmed/31890793
http://dx.doi.org/10.1016/j.dib.2019.104942