
GPU Batch Slurm Jobs

Details

 

Please Note

Loading a CUDA module may require loading additional modules. Check the output of module spider CUDA/<version> for information.
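
For example, the prerequisites of a specific CUDA version can be looked up as follows; the version numbers shown here are only placeholders and have to be replaced by the ones actually reported on the cluster:

# List all installed CUDA versions
module spider CUDA

# Show the prerequisites of one specific version (11.8.0 is only a placeholder)
module spider CUDA/11.8.0

# Load the prerequisite module(s) reported above first, then CUDA itself
# (GCCcore/11.3.0 is an assumption; use whatever module spider lists)
module load GCCcore/11.3.0 CUDA/11.8.0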

Simple GPU Example

Run deviceQuery (from NVIDIA SDK) on one device:

#!/usr/bin/zsh

#SBATCH -J gpu_serial
#SBATCH -o gpu_serial.%J.log
#SBATCH --gres=gpu:1

module load CUDA

# Print some debug information
echo; export; echo; nvidia-smi; echo

$CUDA_ROOT/extras/demo_suite/deviceQuery -noprompt
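
Assuming the script above has been saved as gpu_serial.sh (the file name is arbitrary), it can be submitted from a login node like this:

# Submit the job script to Slurm
sbatch gpu_serial.sh

# Check the state of your own jobs in the queue
squeue -u $USER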

MPI + GPU

To run an MPI application on the GPU nodes, you need to take special care to correctly set the number of MPI ranks per node. Typical setups are:

  1. One process per node (ppn):

    Use this setup if each process uses both GPUs of a node at the same time, e.g. via cudaSetDevice or by respecting CUDA_VISIBLE_DEVICES (set automatically by the batch system). Let ./cuda-mpi be the path to your MPI-compatible CUDA program:

    #!/usr/bin/zsh
    
    ### Setup in this script:
    ### - 4 nodes (c18g)
    ### - 1 rank per node
    ### - 2 GPUs per rank (= both GPUs from the node)
    #SBATCH -J 4-1-2
    #SBATCH -o 4-1-2.%J.log
    #SBATCH --ntasks=4
    #SBATCH --ntasks-per-node=1
    #SBATCH --gres=gpu:2
    
    module load CUDA
    
    # Print some debug information
    echo; export; echo; nvidia-smi; echo
    
    $MPIEXEC $FLAGS_MPI_BATCH ./cuda-mpi
    
  2. Two processes per node (ppn):

    Use this setup (recommended) if each process communicates with its own single GPU, so that both GPUs of a node are used. Let ./cuda-mpi be the path to your MPI-compatible CUDA program:

    #!/usr/bin/zsh
    
    ### Setup in this script:
    ### - 2 nodes (c18g, default)
    ### - 2 ranks per node
    ### - 1 GPU per rank (together: both GPUs of the node)
    
    #SBATCH -J 2-2-1
    #SBATCH -o 2-2-1.%J.log
    #SBATCH --ntasks=4
    #SBATCH --ntasks-per-node=2
    #SBATCH --gres=gpu:2
    
    module load CUDA
    
    # Print some debug information
    echo; export; echo; nvidia-smi; echo
    
    $MPIEXEC $FLAGS_MPI_BATCH ./cuda-mpi
    
  3. More than 2 processes per node:

    Use this setup if, in addition to the ranks driving the GPUs, you also have ranks that do computation on the CPU only, as sketched below.
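
    The following is only a minimal sketch of such a setup; the node and rank counts as well as the program name ./hybrid-mpi are assumptions and have to be adapted to your application (here: 2 ranks per node drive one GPU each, the remaining ranks compute on the CPU):

    #!/usr/bin/zsh
    
    ### Sketch only - all numbers are assumptions and must be adapted:
    ### - 2 nodes (c18g)
    ### - 24 ranks per node (2 of them drive one GPU each, 22 run on the CPU)
    ### - 2 GPUs per node
    #SBATCH -J 2-24-2
    #SBATCH -o 2-24-2.%J.log
    #SBATCH --ntasks=48
    #SBATCH --ntasks-per-node=24
    #SBATCH --gres=gpu:2
    
    module load CUDA
    
    # Print some debug information
    echo; export; echo; nvidia-smi; echo
    
    # ./hybrid-mpi is a placeholder for an application that decides internally
    # which of its ranks use a GPU and which compute on the CPU only
    $MPIEXEC $FLAGS_MPI_BATCH ./hybrid-mpi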


Additional Information

Information about Slurm accounting can be found in Slurm Accounting.

Information about submitting a GPU job can be found in Submitting a GPU job.

Last changed on 20.10.2023


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.