Things you can do dumping your Invenio database into a flat file
Invenio database design and interfaces are optimized for fast end user search and retrieval. As administrators, we can add indexes at will and use them via web or API. However, many maintenance tasks are not well covered with those indexes. For most of those cases, reading the re...
Main author: | Jorba, Ferran |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Invenio User Group Workshops |
Online access: | http://cds.cern.ch/record/2259672 |
_version_ | 1780953948068249600 |
---|---|
author | Jorba, Ferran |
author_facet | Jorba, Ferran |
author_sort | Jorba, Ferran |
collection | CERN |
description | Invenio database design and interfaces are optimized for fast end-user search and retrieval. As administrators, we can add indexes at will and use them via the web or the API. However, many maintenance tasks are not well covered by those indexes. For most of those cases, reading the records sequentially is the optimal solution. However, if the database is large enough, reading them via the Invenio API may take hours, during which the system slows down and may become unresponsive. In this presentation I'll show a small Python tool that uses the Invenio API and a SQLite database as a cache to keep an up-to-date flat file with your bibliographic records. We'll see how, with this flat file, it is much faster and easier to do tasks like generating specialised statistics, quality control, automatic record enrichment or cleaning, or even creating exotic indexes or counters. |
id | cern-2259672 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | cern-2259672 2022-11-02T22:13:00Z http://cds.cern.ch/record/2259672 eng Jorba, Ferran Things you can do dumping your Invenio database into a flat file Invenio User Group Workshop 2017 Invenio User Group Workshops Invenio database design and interfaces are optimized for fast end-user search and retrieval. As administrators, we can add indexes at will and use them via the web or the API. However, many maintenance tasks are not well covered by those indexes. For most of those cases, reading the records sequentially is the optimal solution. However, if the database is large enough, reading them via the Invenio API may take hours, during which the system slows down and may become unresponsive. In this presentation I'll show a small Python tool that uses the Invenio API and a SQLite database as a cache to keep an up-to-date flat file with your bibliographic records. We'll see how, with this flat file, it is much faster and easier to do tasks like generating specialised statistics, quality control, automatic record enrichment or cleaning, or even creating exotic indexes or counters. oai:cds.cern.ch:2259672 2017 |
spellingShingle | Invenio User Group Workshops Jorba, Ferran Things you can do dumping your Invenio database into a flat file |
title | Things you can do dumping your Invenio database into a flat file |
title_full | Things you can do dumping your Invenio database into a flat file |
title_fullStr | Things you can do dumping your Invenio database into a flat file |
title_full_unstemmed | Things you can do dumping your Invenio database into a flat file |
title_short | Things you can do dumping your Invenio database into a flat file |
title_sort | things you can do dumping your invenio database into a flat file |
topic | Invenio User Group Workshops |
url | http://cds.cern.ch/record/2259672 |
work_keys_str_mv | AT jorbaferran thingsyoucandodumpingyourinveniodatabaseintoaflatfile AT jorbaferran inveniousergroupworkshop2017 |
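
The description in this record outlines a tool that uses a SQLite database as a cache and keeps a flat file of bibliographic records up to date. The following is only a minimal sketch of that general idea, not the author's tool. Everything in it is an assumption made for illustration: the `records` table layout, the function names, the tab-separated output format, and the `fetch_changed` callable, which stands in for whatever Invenio API call returns records modified since a given timestamp.

```python
import sqlite3
from typing import Callable, Iterable, Tuple

# Hypothetical record shape: (record id, ISO 8601 modification time, serialized record).
Record = Tuple[int, str, str]


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the SQLite cache. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY, modified TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def last_sync(conn: sqlite3.Connection) -> str:
    """Return the newest modification time in the cache, or '' if it is empty."""
    row = conn.execute("SELECT MAX(modified) FROM records").fetchone()
    return row[0] or ""


def update_cache(
    conn: sqlite3.Connection,
    fetch_changed: Callable[[str], Iterable[Record]],
) -> int:
    """Store only the records changed since the last sync.

    `fetch_changed` is a placeholder for the real API call; it receives the
    last known modification time and yields the newer records.
    """
    since = last_sync(conn)
    count = 0
    with conn:  # one transaction: commit on success, roll back on error
        for rec_id, modified, body in fetch_changed(since):
            conn.execute(
                "INSERT OR REPLACE INTO records (id, modified, body) VALUES (?, ?, ?)",
                (rec_id, modified, body),
            )
            count += 1
    return count


def dump_flat_file(conn: sqlite3.Connection, out_path: str) -> int:
    """Write one 'id<TAB>body' line per cached record, ordered by id.

    Assumes each serialized record contains no tabs or newlines.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for rec_id, body in conn.execute("SELECT id, body FROM records ORDER BY id"):
            out.write(f"{rec_id}\t{body}\n")
            count += 1
    return count
```

With a cache like this, the slow sequential read through the API is limited to records that changed since the previous run, and maintenance tasks can then scan the flat file without loading the live system.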