TeaMPI—Replication-Based Resilience Without the (Performance) Pain

In an era where we cannot afford to checkpoint frequently, replication is a generic way forward to construct numerical simulations that can continue to run even if hardware parts fail. Yet replication is often not employed on larger scales, as naïvely mirroring a computation once effectively halve...

Bibliographic Details
Main authors: Samfass, Philipp; Weinzierl, Tobias; Hazelwood, Benjamin; Bader, Michael
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295348/
http://dx.doi.org/10.1007/978-3-030-50743-5_23