ARETE: accurate error assessment via machine learning-guided dynamic-timing analysis

Ioannis Tsiokanos, Styliani Tompazi, Giorgis Georgakoudis, Lev Mukhanov, Georgios Karakonstantis

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Nanometer circuits are increasingly prone to timing errors, escalating the need for fault-injection frameworks that accurately evaluate the impact of these errors on applications. In this paper, we propose ARETE, a novel cross-layer fault-injection framework that combines dynamic-binary instrumentation with machine learning-guided dynamic-timing analysis. ARETE enables accurate fault-injection into any application by estimating the location of the injected errors via dynamic-timing analysis. To accelerate fault-injection, we develop a novel, data-aware, machine learning-based mechanism that dynamically pre-selects the error-prone instructions and applies the costly dynamic-timing analysis only to them. To evaluate ARETE's accuracy, our fully automated toolflow is configured to support fault-injection based on detailed post-layout gate-level simulations as well as on existing workload-agnostic error models. Our results for various workloads, including an autonomous-driving library, show that the location and time of the errors injected by ARETE are 89.9% consistent with fault-injection based on full gate-level simulation. On average, ARETE executes 84.6× faster than gate-level simulation, at a cost of a 3.4% loss in program output quality estimation. Compared to existing statistical fault-injection tools that are based on workload-agnostic error models, ARETE improves the accuracy of fault-injection rate and output quality estimation by 143.9% and 40.4% on average, respectively.

Original language: English
Pages (from-to): 1026-1040
Number of pages: 14
Journal: IEEE Transactions on Computers
Volume: 72
Issue number: 4
Early online date: 18 Jul 2022
Publication status: Published - 01 Apr 2023

Keywords

  • Circuit faults
  • Computational modeling
  • Cross-layer fault injection
  • Delays
  • dynamic binary instrumentation
  • dynamic timing analysis
  • fault injection
  • Integrated circuit modeling
  • Logic gates
  • machine learning
  • Microarchitecture
  • Pipelines
  • timing error evaluation

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
