Field Programmable Gate Array (FPGA) boast abundant resources with which to realise high-performance accelerators for computationally demanding operations. Highly efficient accelerators may be automatically derived from Signal Flow Graph (SFG) models by using architectural synthesis techniques, but in practical design scenarios, these currently operate under two important limitations - they cannot efficiently harness the programmable datapath components which make up an increasing proportion of the computational capacity of modern FPGA and they are unable to automatically derive accelerators to meet a prescribed throughput or latency requirement. This paper addresses these limitations. SFG synthesis is enabled which derives software-programmable multicore single-instruction, multiple-data (SIMD) accelerators which, via combined offline characterisation of multicore performance and compile-time program analysis, meet prescribed throughput requirements. The effectiveness of these techniques is demonstrated on tree-search and linear algebraic accelerators for 802.11n WiFi transceivers, an application for which satisfying real-time performance requirements has, to this point, proven challenging for even manually-derived architectures.
|Journal||IEEE Transactions on Parallel and Distributed Systems|
|Early online date||29 Aug 2017|
|Publication status||Published - 01 Jan 2018|