Sector and Sphere: the design and implementation of a high-performance data cloud
Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with t...
Main authors: | Gu, Yunhong; Grossman, Robert L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society, 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3391065/ https://www.ncbi.nlm.nih.gov/pubmed/19451100 http://dx.doi.org/10.1098/rsta.2009.0053 |
_version_ | 1782237484188434432 |
---|---|
author | Gu, Yunhong Grossman, Robert L. |
collection | PubMed |
description | Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with the existing storage and compute clouds, Sector can manage data not only within a data centre, but also across geographically distributed data centres. Similarly, the Sphere compute cloud supports user-defined functions (UDFs) over data both within and across data centres. As a special case, MapReduce-style programming can be implemented in Sphere by using a Map UDF followed by a Reduce UDF. We describe some experimental studies comparing Sector/Sphere and Hadoop using the Terasort benchmark. In these studies, Sector is approximately twice as fast as Hadoop. Sector/Sphere is open source. |
format | Online Article Text |
id | pubmed-3391065 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-33910652012-07-12 Sector and Sphere: the design and implementation of a high-performance data cloud Gu, Yunhong Grossman, Robert L. Philos Trans A Math Phys Eng Sci Articles Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with the existing storage and compute clouds, Sector can manage data not only within a data centre, but also across geographically distributed data centres. Similarly, the Sphere compute cloud supports user-defined functions (UDFs) over data both within and across data centres. As a special case, MapReduce-style programming can be implemented in Sphere by using a Map UDF followed by a Reduce UDF. We describe some experimental studies comparing Sector/Sphere and Hadoop using the Terasort benchmark. In these studies, Sector is approximately twice as fast as Hadoop. Sector/Sphere is open source. The Royal Society 2009-06-28 /pmc/articles/PMC3391065/ /pubmed/19451100 http://dx.doi.org/10.1098/rsta.2009.0053 Text en Copyright © 2009 The Royal Society http://creativecommons.org/licenses/by/2.5/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Sector and Sphere: the design and implementation of a high-performance data cloud |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3391065/ https://www.ncbi.nlm.nih.gov/pubmed/19451100 http://dx.doi.org/10.1098/rsta.2009.0053 |