Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures
Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache memory without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem....
Main Authors: | Lamela, Adrián; Ossorio, Óscar G.; Vinuesa, Guillermo; Sahelices, Benjamín |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8439492/ https://www.ncbi.nlm.nih.gov/pubmed/34520473 http://dx.doi.org/10.1371/journal.pone.0257047 |
_version_ | 1783752536846499840 |
---|---|
author | Lamela, Adrián Ossorio, Óscar G. Vinuesa, Guillermo Sahelices, Benjamín |
author_facet | Lamela, Adrián Ossorio, Óscar G. Vinuesa, Guillermo Sahelices, Benjamín |
author_sort | Lamela, Adrián |
collection | PubMed |
description | Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache memory without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem. In this work we present a novel off-chip prefetch technique based on a Hidden Markov Model that specifically addresses the latency problem caused by the complexity of off-chip memory access patterns. First, we present a thorough analysis of off-chip memory access patterns to identify their complexity in multicore processors. Based on this study, we propose a prefetching module located in the LLC that uses two small tables and whose computational complexity is linear in the number of computing threads. Our Markov-based technique can track and cluster several simultaneous groups of memory accesses coming from multiple simultaneous threads in a multicore processor. It can quickly identify complex address groups and trigger prefetches with very high accuracy. Our simulations show an improvement of up to 76% in the hit ratio of an off-chip DRAM cache for multicore architectures over the conventional prefetch technique (G/DC). Also, the overhead of prefetch requests (failed prefetches) is reduced by 48% in single-core simulations and by 83% in multicore simulations. |
format | Online Article Text |
id | pubmed-8439492 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8439492 2021-09-15 Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures Lamela, Adrián Ossorio, Óscar G. Vinuesa, Guillermo Sahelices, Benjamín PLoS One Research Article Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache memory without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem. In this work we present a novel off-chip prefetch technique based on a Hidden Markov Model that specifically addresses the latency problem caused by the complexity of off-chip memory access patterns. First, we present a thorough analysis of off-chip memory access patterns to identify their complexity in multicore processors. Based on this study, we propose a prefetching module located in the LLC that uses two small tables and whose computational complexity is linear in the number of computing threads. Our Markov-based technique can track and cluster several simultaneous groups of memory accesses coming from multiple simultaneous threads in a multicore processor. It can quickly identify complex address groups and trigger prefetches with very high accuracy. Our simulations show an improvement of up to 76% in the hit ratio of an off-chip DRAM cache for multicore architectures over the conventional prefetch technique (G/DC). Also, the overhead of prefetch requests (failed prefetches) is reduced by 48% in single-core simulations and by 83% in multicore simulations. 
Public Library of Science 2021-09-14 /pmc/articles/PMC8439492/ /pubmed/34520473 http://dx.doi.org/10.1371/journal.pone.0257047 Text en © 2021 Lamela et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Lamela, Adrián Ossorio, Óscar G. Vinuesa, Guillermo Sahelices, Benjamín Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title | Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title_full | Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title_fullStr | Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title_full_unstemmed | Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title_short | Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures |
title_sort | off-chip prefetching based on hidden markov model for non-volatile memory architectures |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8439492/ https://www.ncbi.nlm.nih.gov/pubmed/34520473 http://dx.doi.org/10.1371/journal.pone.0257047 |
work_keys_str_mv | AT lamelaadrian offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT ossoriooscarg offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT vinuesaguillermo offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT sahelicesbenjamin offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures |
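
The abstract describes a table-based prefetcher that tracks access streams from several threads at once. As a rough illustration of that general idea only, the sketch below implements a much simpler first-order Markov model over address deltas, with one history entry per thread and a shared transition table. It is not the Hidden Markov Model design from the article; the class name `MarkovDeltaPrefetcher`, its two dictionaries, and the example addresses are all assumptions made for this example.

```python
from collections import Counter, defaultdict


class MarkovDeltaPrefetcher:
    """Toy first-order Markov prefetcher over address deltas (hypothetical).

    This is a simplified stand-in, not the article's HMM-based module.
    """

    def __init__(self):
        # Per-thread history: thread id -> (last address, last delta or None).
        self.history = {}
        # Shared transition counts: previous delta -> Counter of next deltas.
        self.transitions = defaultdict(Counter)

    def access(self, thread_id, address):
        """Record one access and return a predicted prefetch address, or None."""
        last = self.history.get(thread_id)
        if last is None:
            # First access from this thread: nothing to learn or predict yet.
            self.history[thread_id] = (address, None)
            return None

        last_address, last_delta = last
        delta = address - last_address
        if last_delta is not None:
            # Learn that `delta` followed `last_delta`.
            self.transitions[last_delta][delta] += 1
        self.history[thread_id] = (address, delta)

        # Predict the most frequently observed successor of the current delta.
        followers = self.transitions.get(delta)
        if not followers:
            return None
        next_delta, _count = followers.most_common(1)[0]
        return address + next_delta


if __name__ == "__main__":
    prefetcher = MarkovDeltaPrefetcher()
    # Two interleaved threads with different strides (made-up addresses).
    trace = [(0, 0), (1, 1000), (0, 64), (1, 1128), (0, 128), (1, 1256)]
    for thread_id, address in trace:
        print(thread_id, address, prefetcher.access(thread_id, address))
```

Keeping the history per thread is what lets the interleaved streams be separated before deltas are computed; a single global history would see the mixed trace as noise.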