NVLink
While the GPUs are attached to the host via PCIe, the GPUs within a host node are connected to each other via NVLink. You can check the status of each GPU's NVLink links with the nvidia-smi command:
ab123456@login18-g-1:~[505]$ nvidia-smi nvlink --status -i 0
GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-388a149d-072d-881b-469c-a3e3c7c4dc19)
Link 0: 25.781 GB/s
Link 1: <inactive>
Link 2: 25.781 GB/s
Link 3: <inactive>
Link 4: <inactive>
Link 5: <inactive>
ab123456@login18-g-1:~[506]$ nvidia-smi nvlink --status -i 1
GPU 1: Tesla V100-SXM2-16GB (UUID: GPU-142d0ca2-b09d-9c28-5583-7b3b749571f9)
Link 0: <inactive>
Link 1: 25.781 GB/s
Link 2: 25.781 GB/s
Link 3: <inactive>
Link 4: <inactive>
Link 5: <inactive>
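Peer connectivity can also be queried from inside a program. The following is a minimal sketch, assuming the CUDA runtime API is available; the helper name countPeerPairs is made up for this illustration. Note that cudaDeviceCanAccessPeer only reports whether direct peer access is possible. It does not tell you whether the path goes over NVLink or PCIe, so nvidia-smi remains the tool for checking the links themselves.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints, for every ordered pair of GPUs, whether
// direct peer access is possible. Returns the number of such pairs,
// or -1 if the CUDA runtime reports an error (e.g. no usable GPU).
int countPeerPairs() {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) return -1;

    int pairs = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int can = 0;
            if (cudaDeviceCanAccessPeer(&can, i, j) != cudaSuccess) return -1;
            std::printf("GPU %d -> GPU %d: peer access %s\n",
                        i, j, can ? "possible" : "not possible");
            pairs += can;
        }
    }
    return pairs;
}

int main() {
    int pairs = countPeerPairs();
    if (pairs < 0) {
        std::fprintf(stderr, "could not query CUDA devices\n");
        return 1;
    }
    std::printf("%d peer-capable GPU pair(s)\n", pairs);
    return 0;
}
```

Compile it with nvcc and run it on a GPU node; on a login node without GPUs it will only report that no devices could be queried.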
To use NVLink for direct GPU-to-GPU transfers, you may need to adapt your program accordingly. In CUDA, peer access between two GPUs can be enabled as follows:
unsigned int flags = 0;
cudaSetDevice(0);
cudaDeviceEnablePeerAccess(1,flags);
...
cudaSetDevice(1);
cudaDeviceEnablePeerAccess(0,flags);
...
// For example, copy data between the two GPUs.
// Be aware: this copy over NVLink can be asynchronous with respect to the host,
// so synchronize before relying on the result.
cudaMemcpy(<dataDest>, <dataSrc>, <size>, cudaMemcpyDeviceToDevice);
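For reference, the fragment above can be expanded into a complete program. This is only a sketch under some assumptions: the node has at least two GPUs with device IDs 0 and 1, the buffer size is chosen arbitrarily, and the function name peerCopyRoundTrip and the CHECK macro are hypothetical helpers introduced here. It uses cudaMemcpyPeer, which names the source and destination devices explicitly, instead of cudaMemcpy with cudaMemcpyDeviceToDevice.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: report a failing CUDA call and return 1.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Copies a buffer host -> GPU 0 -> GPU 1 -> host and verifies it.
// Returns 0 on success or if the test has to be skipped, 1 on error.
int peerCopyRoundTrip() {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess || n < 2) {
        std::printf("fewer than two usable GPUs, skipping\n");
        return 0;
    }
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    if (!can01 || !can10) {
        std::printf("no peer access between GPU 0 and GPU 1, skipping\n");
        return 0;
    }

    const size_t count = 1 << 20;               // arbitrary example size
    const size_t bytes = count * sizeof(float);
    std::vector<float> src(count, 1.0f), dst(count, 0.0f);
    float *d0 = nullptr, *d1 = nullptr;

    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));    // GPU 0 may access GPU 1
    CHECK(cudaMalloc((void **)&d0, bytes));
    CHECK(cudaMemcpy(d0, src.data(), bytes, cudaMemcpyHostToDevice));

    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceEnablePeerAccess(0, 0));    // GPU 1 may access GPU 0
    CHECK(cudaMalloc((void **)&d1, bytes));

    // Direct GPU 0 -> GPU 1 copy; wait for it before using the data.
    CHECK(cudaMemcpyPeer(d1, 1, d0, 0, bytes));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaMemcpy(dst.data(), d1, bytes, cudaMemcpyDeviceToHost));
    bool ok = true;
    for (size_t i = 0; i < count; ++i) ok = ok && (dst[i] == 1.0f);
    std::printf("peer copy %s\n", ok ? "verified" : "MISMATCH");

    CHECK(cudaFree(d1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(d0));
    return ok ? 0 : 1;
}

int main() { return peerCopyRoundTrip(); }
```

Checking cudaDeviceCanAccessPeer first avoids an error from cudaDeviceEnablePeerAccess on nodes where the two GPUs cannot reach each other directly.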