A paper on accelerated loading of CSV data using GPUs and RDMA, written by researchers from the Database Systems and Information Management Group (DIMA) at TU Berlin and the Intelligent Analytics for Massive Data (IAM) research group at DFKI, was accepted at the 19th symposium “Database Systems for Business, Technology and Web” (BTW 2021), which will take place from September 20 to 24, 2021.
In their paper “Fast CSV Loading Using GPUs and RDMA for In-Memory Data Processing”, Alexander Kumaigorodski, Clemens Lutz, and Volker Markl devise a new CSV parsing approach that streamlines the control flow while correctly handling context-sensitive CSV features. By offloading I/O and parsing to the GPU, their approach enables databases to load CSV files at high throughput from main memory with NVLink 2.0, as well as directly from the network with RDMA.
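To illustrate what "context-sensitive" can mean for CSV: a newline or comma inside a quoted field is data, not a record or field boundary, so a parser has to know whether it is inside quotes. The Python sketch below is a sequential illustration of that one problem, not the authors' GPU algorithm; the helper name `split_records` and the sample data are hypothetical.

```python
def split_records(data: str) -> list[str]:
    """Split CSV text into records, ignoring newlines inside quoted fields."""
    records = []
    start = 0
    in_quotes = False
    for i, ch in enumerate(data):
        if ch == '"':
            # An escaped quote ("") toggles the state twice, so it cancels out.
            in_quotes = not in_quotes
        elif ch == "\n" and not in_quotes:
            # Only an unquoted newline ends a record.
            records.append(data[start:i])
            start = i + 1
    if start < len(data):
        # Keep a final record that has no trailing newline.
        records.append(data[start:])
    return records


# Hypothetical sample: the second record contains a quoted newline.
sample = 'id,comment\n1,"line one\nline two"\n2,"a, b"\n'

naive = sample.split("\n")[:-1]  # splits the quoted field in two
aware = split_records(sample)    # keeps the quoted field intact
```

In this sketch, deciding whether a given newline ends a record depends on every quote character before it. That dependency is one reason why splitting CSV input into independently parsable chunks is not trivial.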
A preprint version of the paper is available here.