Distributed consensus and fault tolerance - Lecture 2
Main Author: | Bitzes, Georgios |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | inverted CSC |
Online Access: | http://cds.cern.ch/record/2255145 |
_version_ | 1780953686895230976 |
---|---|
author | Bitzes, Georgios |
author_facet | Bitzes, Georgios |
author_sort | Bitzes, Georgios |
collection | CERN |
description | In a world where clusters with thousands of nodes are becoming commonplace, we are often faced with the task of having them coordinate and share state. As the number of machines goes up, so does the probability that something goes wrong: a node could temporarily lose connectivity, crash because of some race condition, or have its hard drive fail. What are the challenges when designing fault-tolerant distributed systems, where a cluster is able to survive the loss of individual nodes? In this lecture, we will discuss some basics on this topic (consistency models, CAP theorem, failure modes, Byzantine faults), detail the Raft consensus algorithm, and showcase an interesting example of a highly resilient distributed system, Bitcoin. (A short sketch of the failure-probability intuition follows this record.) |
id | cern-2255145 |
institution | European Organization for Nuclear Research (CERN) |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | cern-2255145 2022-11-02T22:32:27Z http://cds.cern.ch/record/2255145 eng Bitzes, Georgios Distributed consensus and fault tolerance - Lecture 2 Inverted CERN School of Computing 2017 inverted CSC oai:cds.cern.ch:2255145 2017 |
spellingShingle | inverted CSC Bitzes, Georgios Distributed consensus and fault tolerance - Lecture 2 |
title | Distributed consensus and fault tolerance - Lecture 2 |
title_full | Distributed consensus and fault tolerance - Lecture 2 |
title_fullStr | Distributed consensus and fault tolerance - Lecture 2 |
title_full_unstemmed | Distributed consensus and fault tolerance - Lecture 2 |
title_short | Distributed consensus and fault tolerance - Lecture 2 |
title_sort | distributed consensus and fault tolerance - lecture 2 |
topic | inverted CSC |
url | http://cds.cern.ch/record/2255145 |
work_keys_str_mv | AT bitzesgeorgios distributedconsensusandfaulttolerancelecture2 AT bitzesgeorgios invertedcernschoolofcomputing2017 |
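
The abstract's opening observation, that the chance of something going wrong grows with the number of machines, can be made concrete with a small back-of-the-envelope calculation. The sketch below is an illustration added to this record, not part of the lecture material: assuming each node fails independently within some time window with probability p (a hypothetical parameter), the probability that at least one of N nodes fails is 1 - (1 - p)^N.

```python
# Illustrative sketch (not from the lecture): if each of N nodes fails
# independently with probability p during some time window, the chance
# that at least one node fails is 1 - (1 - p)**N.

def prob_any_failure(n_nodes: int, p_node: float) -> float:
    """Probability that at least one of n_nodes fails, assuming an
    independent per-node failure probability p_node."""
    return 1.0 - (1.0 - p_node) ** n_nodes

if __name__ == "__main__":
    p = 0.001  # hypothetical per-node failure probability in the window
    for n in (1, 10, 100, 1000, 10000):
        print(f"{n:>6} nodes -> P(at least one failure) = {prob_any_failure(n, p):.3f}")
```

With p = 0.001 a single node almost never fails in the window, but a 10 000-node cluster sees at least one failure with probability close to 1, which is why the fault-tolerance techniques the lecture covers (consistency models, consensus via Raft, handling of Byzantine faults) matter at scale.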