Jianxin Xiong, Intel Corp.
Jianxin Xiong is a Software Engineer at Intel. Over the past 15+ years he has worked on various layers of the interconnect software stack, including RDMA drivers in the Linux kernel, RDMA device virtualization, the OpenFabrics Interfaces, DAPL, the Tag Matching Interface, and Intel MPI. His current focus is GPU/accelerator scale-out with RDMA devices.
Discrete GPUs are widely used in systems for high-performance data-parallel computation. Scale-out configurations of such systems often include RDMA-capable NICs to provide high-bandwidth, low-latency inter-node communication. On the PCIe bus, the GPU appears as a peer device of the NIC, and extra steps are needed to set up GPU memory for RDMA operations. Proprietary solutions such as Peer-Direct from Mellanox have existed for this purpose for a while. However, direct use of GPU memory in RDMA operations (a.k.a. GPU Direct RDMA) is still unsupported by upstream RDMA drivers. Dma-buf is a standard mechanism in the Linux kernel for sharing buffers for DMA access across different device drivers and subsystems. In this talk, a prototype is presented that uses dma-buf to enable peer-to-peer DMA between the NIC and GPU memory. The required changes in the kernel RDMA driver, the user-space RDMA core libraries, and the OpenFabrics Interfaces library (libfabric) are discussed in detail. The goal is to provide a non-proprietary approach to enable direct RDMA to/from GPU memory.
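The user-space registration path this work led to is exposed in rdma-core as ibv_reg_dmabuf_mr(). Below is a minimal sketch of how an application might register a GPU-exported dma-buf with the NIC; it assumes the dma-buf fd has already been obtained from the GPU runtime (e.g., via a DRM PRIME or Level Zero export), and the helper function name is a placeholder, not part of any library API:

```c
/* Sketch: registering GPU memory exported as a dma-buf for RDMA access.
 * Assumes `dmabuf_fd` was obtained from the GPU runtime beforehand
 * (e.g., DRM PRIME / Level Zero export). Requires libibverbs with
 * dma-buf MR support; error handling trimmed for brevity. */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not an rdma-core API. */
struct ibv_mr *register_gpu_dmabuf(struct ibv_pd *pd, int dmabuf_fd,
                                   size_t length)
{
    /* offset 0 maps from the start of the dma-buf; iova 0 lets the
     * provider choose the I/O virtual address. */
    struct ibv_mr *mr = ibv_reg_dmabuf_mr(pd, /*offset=*/0, length,
                                          /*iova=*/0, dmabuf_fd,
                                          IBV_ACCESS_LOCAL_WRITE |
                                          IBV_ACCESS_REMOTE_READ |
                                          IBV_ACCESS_REMOTE_WRITE);
    if (!mr)
        perror("ibv_reg_dmabuf_mr");
    return mr;
}
```

Once registered, the returned MR's lkey/rkey are used in scatter-gather entries and work requests exactly like a host-memory MR, and it is released with ibv_dereg_mr(); the NIC driver resolves the peer-to-peer DMA addresses through the dma-buf importer interface rather than pinning CPU pages.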
Video recording: "RDMA with GPU Memory via DMA-Buf," published by insideHPC, 22 June 2020.