PROX: Approximated Summarization of Data Provenance

Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived.

Bibliographic Details
Main authors: Ainy, Eleanor; Bourhis, Pierre; Davidson, Susan B.; Deutch, Daniel; Milo, Tova
Format: Online Article Text
Language: English
Published: 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001561/
https://www.ncbi.nlm.nih.gov/pubmed/27570843
_version_ 1782450442440015872
author Ainy, Eleanor
Bourhis, Pierre
Davidson, Susan B.
Deutch, Daniel
Milo, Tova
author_facet Ainy, Eleanor
Bourhis, Pierre
Davidson, Susan B.
Deutch, Daniel
Milo, Tova
author_sort Ainy, Eleanor
collection PubMed
description Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived. Data provenance has been proven helpful in this respect in different contexts; however, maintaining and presenting the full and exact provenance may be infeasible, due to its size and complex structure. For that reason, we introduce the notion of approximated summarized provenance, where we seek a compact representation of the provenance at the possible cost of information loss. Based on this notion, we have developed PROX, a system for the management, presentation and use of data provenance for complex applications. We propose to demonstrate PROX in the context of a movies rating crowd-sourcing system, letting participants view provenance summarization and use it to gain insights on the application and its underlying data.
format Online
Article
Text
id pubmed-5001561
institution National Center for Biotechnology Information
language English
publishDate 2016
record_format MEDLINE/PubMed
spelling pubmed-5001561 2016-08-26 PROX: Approximated Summarization of Data Provenance Ainy, Eleanor Bourhis, Pierre Davidson, Susan B. Deutch, Daniel Milo, Tova Adv Database Technol Article Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived. Data provenance has been proven helpful in this respect in different contexts; however, maintaining and presenting the full and exact provenance may be infeasible, due to its size and complex structure. For that reason, we introduce the notion of approximated summarized provenance, where we seek a compact representation of the provenance at the possible cost of information loss. Based on this notion, we have developed PROX, a system for the management, presentation and use of data provenance for complex applications. We propose to demonstrate PROX in the context of a movies rating crowd-sourcing system, letting participants view provenance summarization and use it to gain insights on the application and its underlying data. 2016-03 /pmc/articles/PMC5001561/ /pubmed/27570843 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0
spellingShingle Article
Ainy, Eleanor
Bourhis, Pierre
Davidson, Susan B.
Deutch, Daniel
Milo, Tova
PROX: Approximated Summarization of Data Provenance
title PROX: Approximated Summarization of Data Provenance
title_full PROX: Approximated Summarization of Data Provenance
title_fullStr PROX: Approximated Summarization of Data Provenance
title_full_unstemmed PROX: Approximated Summarization of Data Provenance
title_short PROX: Approximated Summarization of Data Provenance
title_sort prox: approximated summarization of data provenance
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001561/
https://www.ncbi.nlm.nih.gov/pubmed/27570843
work_keys_str_mv AT ainyeleanor proxapproximatedsummarizationofdataprovenance
AT bourhispierre proxapproximatedsummarizationofdataprovenance
AT davidsonsusanb proxapproximatedsummarizationofdataprovenance
AT deutchdaniel proxapproximatedsummarizationofdataprovenance
AT milotova proxapproximatedsummarizationofdataprovenance