This paper presents a new implementation, with complete analysis, of the processing operations required in a widely used pedestrian detection algorithm, the histogram of oriented gradients (HOG) detector, when run in various configurations on a heterogeneous platform suitable for use as an embedded system. The platform consists of a field-programmable gate array (FPGA), a graphics processing unit (GPU), and a central processing unit (CPU), and we detail the advantages of such an image processing system for real-time performance. We thoroughly analyze the consequent tradeoffs between power consumption, latency, and accuracy for each possible configuration. We thus demonstrate that any one of these factors can be prioritized by selecting a specific configuration. The configuration may then be changed dynamically to respond to the changing priorities of a real-time system, e.g., on a moving vehicle. We compare the performance of real-time implementations of linear and kernel support vector machines in HOG and evaluate the entire system against the state of the art in real-time person detection. We also show that our FPGA implementation detects pedestrians more accurately than existing implementations, and that a heterogeneous configuration that performs image scaling on the GPU, and histogram extraction and classification on the FPGA, produces a good compromise between power and speed.
|Pages (from-to)||236 - 247|
|Number of pages||12|
|Journal||IEEE Journal on Emerging and Selected Topics in Circuits and Systems|
|Early online date||25 Apr 2013|
|Publication status||Published - 29 Apr 2013|