One of the papers even won the SIGMOD Best Paper Award. The paper, entitled “Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects,” by Clemens Lutz, Sebastian Breß, Steffen Zeuch, Tilmann Rabl, and BIFOLD Co-Director Volker Markl, explores the use of GPUs to accelerate database query processing.
“Fast interconnects enable us to pump up the data volume, with high performance.”
Clemens Lutz, PhD student and researcher in the IAM (DFKI) and DIMA (TUB) research groups
GPUs are generally ill-suited for large-scale data processing for two reasons: (1) the on-board memory capacity is too small to store large datasets, and (2) the interconnect bandwidth to CPU main-memory is insufficient for ad-hoc data transfers. As a result, GPU-based systems face data transfer bottlenecks and do not scale to large datasets. In the paper, the authors demonstrate how a fast interconnect such as NVLink 2.0, which links dedicated GPUs to a CPU, can overcome both scalability issues for a no-partitioning hash join. Their evaluation shows speedups of up to 18x over PCI-e 3.0 and up to 7.3x over an optimized CPU implementation.
Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects
Authors: Clemens Lutz, Sebastian Breß, Steffen Zeuch, Tilmann Rabl, and Volker Markl
Abstract: GPUs have long been discussed as accelerators for database query processing because of their high processing power and memory bandwidth. However, two main challenges limit the utility of GPUs for large-scale data processing: (1) the onboard memory capacity is too small to store large data sets, yet (2) the interconnect bandwidth to CPU main-memory is insufficient for ad-hoc data transfers. As a result, GPU-based systems and algorithms run into a transfer bottleneck and do not scale to large data sets. In practice, CPUs process large-scale data faster than GPUs with current technology. In this paper, we investigate how a fast interconnect can resolve these scalability limitations using the example of NVLink 2.0. NVLink 2.0 is a new interconnect technology that links dedicated GPUs to a CPU. The high bandwidth of NVLink 2.0 enables us to overcome the transfer bottleneck and to efficiently process large data sets stored in main-memory on GPUs. We perform an in-depth analysis of NVLink 2.0 and show how we can scale a no-partitioning hash join beyond the limits of GPU memory. Our evaluation shows speedups of up to 18× over PCI-e 3.0 and up to 7.3× over an optimized CPU implementation. Fast GPU interconnects thus enable GPUs to efficiently accelerate query processing.
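For readers unfamiliar with the term, the following is a minimal sketch of what a no-partitioning hash join does. It is a plain, single-threaded Python illustration and not the authors' GPU implementation. The function name, the (key, payload) row layout, and the sample data are all hypothetical. The idea is that one hash table is built over the entire build-side relation without splitting it into partitions first, and the probe-side relation is then streamed through once.

```python
from collections import defaultdict


def no_partitioning_hash_join(build_rows, probe_rows):
    """Join two lists of (key, payload) tuples on key.

    Illustrative sketch only: neither input is partitioned, and a single
    hash table serves the whole build side.
    """
    # Build phase: insert every build-side row into one shared hash table.
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    # Probe phase: scan the probe side once and look up each key.
    # table.get avoids creating empty entries for keys that have no match.
    result = []
    for key, payload in probe_rows:
        for build_payload in table.get(key, ()):
            result.append((key, build_payload, payload))
    return result


# Hypothetical sample data.
orders = [(1, "order-a"), (2, "order-b"), (2, "order-c")]
customers = [(2, "alice"), (3, "bob"), (1, "carol")]
joined = no_partitioning_hash_join(orders, customers)
```

In the setting the abstract describes, the relations reside in CPU main memory and the GPU reads them over the interconnect, which is why interconnect bandwidth rather than raw compute tends to determine how well such a join scales.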