
On the Trade-Offs of Combining Multiple Secure Processing Primitives for Data Analytics

Cloud Computing services for data analytics are increasingly being sought by companies to extract value from large quantities of information. However, processing data from individuals and companies in third-party infrastructures raises several privacy concerns. To this end, different secure analytic...


Bibliographic Details
Main authors: Carvalho, Hugo; Cruz, Daniel; Pontes, Rogério; Paulo, João; Oliveira, Rui
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7276258/
http://dx.doi.org/10.1007/978-3-030-50323-9_1
author Carvalho, Hugo
Cruz, Daniel
Pontes, Rogério
Paulo, João
Oliveira, Rui
collection PubMed
description Cloud Computing services for data analytics are increasingly being sought by companies to extract value from large quantities of information. However, processing data from individuals and companies in third-party infrastructures raises several privacy concerns. To this end, different secure analytics techniques and systems have recently emerged. These initial proposals leverage specific cryptographic primitives that lack generality, which restricts their application to particular scenarios. In this work, we contribute to this thriving body of knowledge by combining two complementary approaches to process sensitive data. We present SafeSpark, a secure data analytics framework that enables the combination of different cryptographic processing techniques with hardware-based protected environments for privacy-preserving data storage and processing. SafeSpark is modular and extensible, and therefore adapts to data analytics applications with different performance, security, and functionality requirements. We have implemented a SafeSpark prototype based on Spark SQL and Intel SGX hardware. It has been evaluated with the TPC-DS Benchmark under three scenarios using different cryptographic primitives and secure hardware configurations. Each scenario provides a particular set of security guarantees and yields a distinct performance impact, with overheads ranging from as low as 10% to an acceptable 300% when compared to an insecure vanilla deployment of Apache Spark.
format Online
Article
Text
id pubmed-7276258
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-72762582020-06-08 On the Trade-Offs of Combining Multiple Secure Processing Primitives for Data Analytics Carvalho, Hugo Cruz, Daniel Pontes, Rogério Paulo, João Oliveira, Rui Distributed Applications and Interoperable Systems Article Cloud Computing services for data analytics are increasingly being sought by companies to extract value from large quantities of information. However, processing data from individuals and companies in third-party infrastructures raises several privacy concerns. To this end, different secure analytics techniques and systems have recently emerged. These initial proposals leverage specific cryptographic primitives that lack generality, which restricts their application to particular scenarios. In this work, we contribute to this thriving body of knowledge by combining two complementary approaches to process sensitive data. We present SafeSpark, a secure data analytics framework that enables the combination of different cryptographic processing techniques with hardware-based protected environments for privacy-preserving data storage and processing. SafeSpark is modular and extensible, and therefore adapts to data analytics applications with different performance, security, and functionality requirements. We have implemented a SafeSpark prototype based on Spark SQL and Intel SGX hardware. It has been evaluated with the TPC-DS Benchmark under three scenarios using different cryptographic primitives and secure hardware configurations. Each scenario provides a particular set of security guarantees and yields a distinct performance impact, with overheads ranging from as low as 10% to an acceptable 300% when compared to an insecure vanilla deployment of Apache Spark.
2020-05-15 /pmc/articles/PMC7276258/ http://dx.doi.org/10.1007/978-3-030-50323-9_1 Text en © IFIP International Federation for Information Processing 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title On the Trade-Offs of Combining Multiple Secure Processing Primitives for Data Analytics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7276258/
http://dx.doi.org/10.1007/978-3-030-50323-9_1