This document discusses qCUDA-ARM, a virtualization solution for embedded GPU architectures. It provides an overview of edge computing challenges, GPU virtualization methods like API remoting, and the design and implementation of qCUDA-ARM. Key aspects covered include using address space conversion to enable zero-copy memory between guest and host, a pinned memory allocation mechanism for ARM, and improvements to memory bandwidth when copying data. The document evaluates qCUDA-ARM's performance on various benchmarks and applications compared to rCUDA and native CUDA.