The OpenScience Slovenia metadata dataset
The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of...
Main Authors: | Borovič, Mladen; Ferme, Marko; Brezovnik, Janez; Majninger, Sandi; Bregant, Albin; Hrovat, Goran; Ojsteršek, Milan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2019 |
Subjects: | Computer Science |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6928342/ https://www.ncbi.nlm.nih.gov/pubmed/31890793 http://dx.doi.org/10.1016/j.dib.2019.104942 |
_version_ | 1783482465029980160 |
---|---|
author | Borovič, Mladen; Ferme, Marko; Brezovnik, Janez; Majninger, Sandi; Bregant, Albin; Hrovat, Goran; Ojsteršek, Milan |
author_facet | Borovič, Mladen; Ferme, Marko; Brezovnik, Janez; Majninger, Sandi; Bregant, Albin; Hrovat, Goran; Ojsteršek, Milan |
author_sort | Borovič, Mladen |
collection | PubMed |
description | The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents, which include undergraduate and postgraduate theses, research and professional articles, and other academic document types. The data was collected as part of the establishment of the Slovenian Open-Access Infrastructure, which defined a unified document collection process and cataloguing for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The dataset is especially suited to text mining and text classification tasks, and it can also be used to develop or benchmark content-based recommender systems on real-world data. |
format | Online Article Text |
id | pubmed-6928342 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-69283422019-12-30 The OpenScience Slovenia metadata dataset Borovič, Mladen Ferme, Marko Brezovnik, Janez Majninger, Sandi Bregant, Albin Hrovat, Goran Ojsteršek, Milan Data Brief Computer Science The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of the establishment of the Slovenian Open-Access Infrastructure which defined a unified document collection process and cataloguing for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields, representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The potential of this dataset lies especially in text mining and text classification tasks and can also be used in development or benchmarking of content-based recommender systems on real-world data. Elsevier 2019-12-05 /pmc/articles/PMC6928342/ /pubmed/31890793 http://dx.doi.org/10.1016/j.dib.2019.104942 Text en © 2019 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Computer Science Borovič, Mladen Ferme, Marko Brezovnik, Janez Majninger, Sandi Bregant, Albin Hrovat, Goran Ojsteršek, Milan The OpenScience Slovenia metadata dataset |
title | The OpenScience Slovenia metadata dataset |
title_full | The OpenScience Slovenia metadata dataset |
title_fullStr | The OpenScience Slovenia metadata dataset |
title_full_unstemmed | The OpenScience Slovenia metadata dataset |
title_short | The OpenScience Slovenia metadata dataset |
title_sort | openscience slovenia metadata dataset |
topic | Computer Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6928342/ https://www.ncbi.nlm.nih.gov/pubmed/31890793 http://dx.doi.org/10.1016/j.dib.2019.104942 |