PROX: Approximated Summarization of Data Provenance
Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how infor...
Main authors: | Ainy, Eleanor; Bourhis, Pierre; Davidson, Susan B.; Deutch, Daniel; Milo, Tova |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001561/ https://www.ncbi.nlm.nih.gov/pubmed/27570843 |
_version_ | 1782450442440015872 |
---|---|
author | Ainy, Eleanor Bourhis, Pierre Davidson, Susan B. Deutch, Daniel Milo, Tova |
author_facet | Ainy, Eleanor Bourhis, Pierre Davidson, Susan B. Deutch, Daniel Milo, Tova |
author_sort | Ainy, Eleanor |
collection | PubMed |
description | Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived. Data provenance has been proven helpful in this respect in different contexts; however, maintaining and presenting the full and exact provenance may be infeasible, due to its size and complex structure. For that reason, we introduce the notion of approximated summarized provenance, where we seek a compact representation of the provenance at the possible cost of information loss. Based on this notion, we have developed PROX, a system for the management, presentation and use of data provenance for complex applications. We propose to demonstrate PROX in the context of a movies rating crowd-sourcing system, letting participants view provenance summarization and use it to gain insights on the application and its underlying data. |
format | Online Article Text |
id | pubmed-5001561 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
record_format | MEDLINE/PubMed |
spelling | pubmed-50015612016-08-26 PROX: Approximated Summarization of Data Provenance Ainy, Eleanor Bourhis, Pierre Davidson, Susan B. Deutch, Daniel Milo, Tova Adv Database Technol Article Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived. Data provenance has been proven helpful in this respect in different contexts; however, maintaining and presenting the full and exact provenance may be infeasible, due to its size and complex structure. For that reason, we introduce the notion of approximated summarized provenance, where we seek a compact representation of the provenance at the possible cost of information loss. Based on this notion, we have developed PROX, a system for the management, presentation and use of data provenance for complex applications. We propose to demonstrate PROX in the context of a movies rating crowd-sourcing system, letting participants view provenance summarization and use it to gain insights on the application and its underlying data. 2016-03 /pmc/articles/PMC5001561/ /pubmed/27570843 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0 |
spellingShingle | Article Ainy, Eleanor Bourhis, Pierre Davidson, Susan B. Deutch, Daniel Milo, Tova PROX: Approximated Summarization of Data Provenance |
title | PROX: Approximated Summarization of Data Provenance |
title_full | PROX: Approximated Summarization of Data Provenance |
title_fullStr | PROX: Approximated Summarization of Data Provenance |
title_full_unstemmed | PROX: Approximated Summarization of Data Provenance |
title_short | PROX: Approximated Summarization of Data Provenance |
title_sort | prox: approximated summarization of data provenance |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001561/ https://www.ncbi.nlm.nih.gov/pubmed/27570843 |
work_keys_str_mv | AT ainyeleanor proxapproximatedsummarizationofdataprovenance AT bourhispierre proxapproximatedsummarizationofdataprovenance AT davidsonsusanb proxapproximatedsummarizationofdataprovenance AT deutchdaniel proxapproximatedsummarizationofdataprovenance AT milotova proxapproximatedsummarizationofdataprovenance |