Abstract
Improving the energy efficiency of the memory subsystem is becoming increasingly important for all digital systems due to the rapid growth of data. Many recent schemes have attempted to reduce DRAM power by relaxing the refresh rate, which may negatively affect DRAM reliability. To optimize the trade-off between power and reliability, existing studies rely on FPGA-based experimental setups and on a few known data patterns for exciting rare worst-case circuit reliability effects. However, in doing so, existing studies may fail to capture the real DRAM behavior within a commodity server running a fully fledged OS.
In this paper, we develop an experimental framework based on a state-of-the-art 64-bit ARM-based server running Linux, which enables the characterization of 72 DRAM chips under a relaxed refresh period and various temperatures controlled by a unique thermal testbed. We evaluate DRAM reliability while running single- and multi-threaded HPC workloads on such a commodity server with a fully fledged Linux OS and a typical multilevel memory hierarchy. Our results show that the manifested Word-Error-Rate under a relaxed refresh period varies among workloads and can differ from the one estimated by the few known fixed data patterns that were conventionally used in all existing studies. We also find that the error rates incurred by the HPC workloads may vary within a program run. Finally, our study shows that the refresh period can be relaxed by 35×, leading to 11.2% power savings on average while avoiding any system disruption, since the available error-correcting codes were able to correct all incurred errors up to 60 °C.
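As a rough illustration of the quantities named in the abstract, the sketch below computes a relaxed refresh period and a word-error rate. It is not the authors' tooling. The 64 ms nominal refresh period is an assumption (the usual JEDEC value at normal operating temperature; the abstract does not state it), and the definition of the word-error rate as erroneous words divided by words checked is likewise an assumption, as are the function names.

```python
# Assumption: 64 ms nominal refresh period (typical JEDEC value at normal
# temperature); the abstract only gives the 35x relaxation factor.
NOMINAL_REFRESH_MS = 64.0


def relaxed_refresh_period_ms(factor, nominal_ms=NOMINAL_REFRESH_MS):
    """Return the refresh period after relaxing the nominal one by `factor`."""
    if factor <= 0:
        raise ValueError("relaxation factor must be positive")
    return nominal_ms * factor


def word_error_rate(expected_words, observed_words):
    """Fraction of memory words whose observed value differs from the expected one.

    Assumed definition: a word counts as erroneous if at least one bit flipped.
    """
    if len(expected_words) != len(observed_words):
        raise ValueError("word lists must have the same length")
    if not expected_words:
        raise ValueError("at least one word is required")
    errors = sum(1 for e, o in zip(expected_words, observed_words) if e != o)
    return errors / len(expected_words)


if __name__ == "__main__":
    # 35x relaxation of the assumed 64 ms nominal period.
    print(relaxed_refresh_period_ms(35), "ms")
    # Hypothetical read-back of four 64-bit words, one with a single bit flip.
    expected = [0xFFFFFFFFFFFFFFFF] * 4
    observed = [0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFE,
                0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF]
    print(word_error_rate(expected, observed))
```

Under these assumptions, a 35× relaxation corresponds to a refresh period of about 2.24 s. The word values in the usage example are made up for illustration and are not measurements from the paper.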
Original language | English |
---|---|
Title of host publication | Proceeding SAMOS '18 |
Subtitle of host publication | Proceedings of the 2018 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS), Greece. 2018 |
Publisher | ACM |
Pages | 230-235 |
ISBN (Print) | 978-1-4503-6494-2 |
DOIs | |
Publication status | Early online date - 15 Jul 2018 |
Event | IEEE International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - Samos, Greece. Duration: 15 Jul 2018 → 19 Jul 2018 |
Conference
Conference | IEEE International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation |
---|---|
Abbreviated title | samos2018 |
Country/Territory | Greece |
City | Samos |
Period | 15/07/2018 → 19/07/2018 |
Fingerprint
Dive into the research topics of 'Characterization of HPC workloads on an ARMv8 based server under relaxed DRAM refresh and thermal stress'. Together they form a unique fingerprint.
Projects
- 2 Active
- R6551CSC: Open TransPREcision COMPuting
  Woods, R., Karakonstantis, G. & Vandierendonck, H.
  03/11/2016 → …
  Project: Research
- R6529CSC: A Universal Micro-Server Ecosystem by Exceeding the Energy and Performance Scaling Boundaries
  Karakonstantis, G., Nikolopoulos, D., O'Neill, M. & Vandierendonck, H.
  17/12/2015 → …
  Project: Research