OBET: On-the-Fly Byte-Level Error Tracking for Correcting and Detecting Faults in Unreliable DRAM Systems
With technology scaling, maintaining the reliability of dynamic random-access memory (DRAM) has become more challenging. Therefore, on-die error correction codes have been introduced to accommodate reliability issues in DDR5. However, the current solution still suffers from high overhead when a larg...
Main authors: | Nguyen, Duy-Thanh; Ho, Nhut-Minh; Wong, Weng-Fai; Chang, Ik-Joon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8708231/ https://www.ncbi.nlm.nih.gov/pubmed/34960359 http://dx.doi.org/10.3390/s21248271 |
_version_ | 1784622632309620736 |
---|---|
author | Nguyen, Duy-Thanh Ho, Nhut-Minh Wong, Weng-Fai Chang, Ik-Joon |
author_facet | Nguyen, Duy-Thanh Ho, Nhut-Minh Wong, Weng-Fai Chang, Ik-Joon |
author_sort | Nguyen, Duy-Thanh |
collection | PubMed |
description | With technology scaling, maintaining the reliability of dynamic random-access memory (DRAM) has become more challenging. Therefore, on-die error correction codes have been introduced to accommodate reliability issues in DDR5. However, the current solution still suffers from high overhead when a large DRAM capacity is used to deliver high performance. To address this problem, we present a DRAM chip architecture that can track DRAM cell errors at the byte level. In our proposed architecture, DRAM faults are classified as temporary or permanent, with no additional pins and with minor DRAM chip modifications. Hence, we achieve reliability comparable to that of other state-of-the-art solutions while incurring negligible performance and energy overhead. Furthermore, the faulty locations are efficiently exposed to the operating system (OS). Thus, we can significantly reduce the required scrubbing cycle by scrubbing only faulty DRAM pages while reducing the system failure probability by up to 5000 to 7000 times relative to conventional operation. |
format | Online Article Text |
id | pubmed-8708231 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-87082312021-12-25 OBET: On-the-Fly Byte-Level Error Tracking for Correcting and Detecting Faults in Unreliable DRAM Systems Nguyen, Duy-Thanh Ho, Nhut-Minh Wong, Weng-Fai Chang, Ik-Joon Sensors (Basel) Article With technology scaling, maintaining the reliability of dynamic random-access memory (DRAM) has become more challenging. Therefore, on-die error correction codes have been introduced to accommodate reliability issues in DDR5. However, the current solution still suffers from high overhead when a large DRAM capacity is used to deliver high performance. We present a DRAM chip architecture that can track faults at byte-level DRAM cell errors to address this problem. DRAM faults are classified as temporary or permanent in our proposed architecture, with no additional pins and with minor DRAM chip modifications. Hence, we achieve reliability comparable to that of other state-of-the-art solutions while incurring negligible performance and energy overhead. Furthermore, the faulty locations are efficiently exposed to the operating system (OS). Thus, we can significantly reduce the required scrubbing cycle by scrubbing only faulty DRAM pages while reducing the system failure probability up to 5000∼7000 times relative to conventional operation. MDPI 2021-12-10 /pmc/articles/PMC8708231/ /pubmed/34960359 http://dx.doi.org/10.3390/s21248271 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Nguyen, Duy-Thanh Ho, Nhut-Minh Wong, Weng-Fai Chang, Ik-Joon OBET: On-the-Fly Byte-Level Error Tracking for Correcting and Detecting Faults in Unreliable DRAM Systems |
title | OBET: On-the-Fly Byte-Level Error Tracking for Correcting and Detecting Faults in Unreliable DRAM Systems |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8708231/ https://www.ncbi.nlm.nih.gov/pubmed/34960359 http://dx.doi.org/10.3390/s21248271 |