NGS: A network GPGPU system for orchestrating remote and virtual accelerators

Javier Prades*, Carlos Reaño, Federico Silla

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

General-purpose computing on graphics processing units (GPGPU) combines CPUs with GPUs: CPUs run sequential code, while GPUs run parallel code. GPGPU has been enabled by two key factors: (i) the massively parallel architecture of GPUs, which allows thousands of individual cores to run parallel code; and (ii) the development of platforms, such as CUDA, that simplify implementing code for GPUs. GPGPU has established itself as the standard computing approach in most computing fields because of the substantial improvements it brings. However, its use is not without problems, including GPU underutilization, high cost, and power consumption. In this paper we present NGS (Network GPGPU System) to address the underutilization of GPUs in computing centers. NGS orchestrates concurrent access to GPGPU resources from different nodes of the cluster by leveraging remote GPU virtualization and NVIDIA's NVML library. In this way, NGS enables different nodes of the cluster to access remote GPUs as if they were local, while guaranteeing that this access takes place without collisions. The main novelty is that NGS offers a global and standard solution independent of the computing environment used. Experimental results show improvements of up to 4x compared to popular approaches.

Original language: English
Article number: 103138
Journal: Journal of Systems Architecture
Volume: 151
Early online date: 12 Apr 2024
Publication status: Published - Jun 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2024 The Authors

Keywords

  • CUDA
  • GPU
  • Scheduling
  • Virtualization

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
