Development on Login-Nodes
The CLAIX cluster offers specialized login-nodes for development. This setup allows you to develop and compile code without having to queue jobs via SLURM. However, keep in mind that resources are shared on a fair-use basis and that usage limits are enforced to prevent abuse. To monitor your own processes, you can use the command top -u $USER.
For tests of non-GPU applications, please submit a job to the devel partition.
All common compilers are available through the module system.
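As an illustration only (the module name, time limit, and binary name are placeholders, and the actual limits of the devel partition should be checked in the cluster documentation), a non-GPU test job could be submitted with a batch script along these lines:
#!/usr/bin/env bash
#SBATCH --partition=devel          # short test jobs, no GPUs
#SBATCH --time=00:15:00            # adjust to the partition's time limit
#SBATCH --ntasks=1
module load GCC                    # placeholder; list available compiler modules with "module avail"
./my_test_application              # placeholder for your compiled test binary
The script is then submitted with sbatch <scriptname>.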
Development on login23-g-1
The node login23-g-1 is equipped with four NVIDIA H100 GPUs designated for development and testing purposes.
The GPUs are configured to support a broad range of tests and accommodate as many users as possible simultaneously. Note that the GPU settings may change based on user feedback.
Gathering Information about GPUs
To gain insight into the current state of the GPUs, you can use the command
nvidia-smi
This shows information about the GPUs, including their load, memory occupancy, and running processes. Most importantly, it also shows the operational mode of each GPU (see below).
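If you prefer a compact, script-friendly summary, nvidia-smi also supports query flags; one possible invocation is:
nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used,memory.total --format=csv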
Selecting Specific GPUs
To select a GPU or MIG instance, set the environment variable as follows:
export CUDA_VISIBLE_DEVICES=<GUID>
To select multiple GPUs, provide a comma-separated list. The GUIDs (reported as UUIDs) can be retrieved using the command:
nvidia-smi -L
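For illustration, with hypothetical UUIDs (copy the actual values from the output of nvidia-smi -L), a single full GPU or a single MIG instance would be selected like this:
export CUDA_VISIBLE_DEVICES=GPU-0f000000-0000-0000-0000-000000000001     # one full GPU
export CUDA_VISIBLE_DEVICES=MIG-1a000000-0000-0000-0000-000000000001     # one MIG instance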
Recommendations
- Reserve GPU Memory: When using a non-exclusive GPU, it is advisable to reserve the amount of GPU memory you anticipate needing for your application. Otherwise, you may unexpectedly run out of memory and your application may crash.
- Multi-GPU Tests: To test multi-GPU applications, select two GPUs that are not partitioned into MIG instances; it is not possible to use two MIG instances together or to combine a MIG instance with another GPU. A sketch of such a selection is shown after this list.
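As a sketch of the multi-GPU case (again with hypothetical UUIDs): full GPUs appear as top-level GPU entries in the output of nvidia-smi -L, while MIG instances are listed as separate MIG entries, so pick two UUIDs belonging to full GPUs and export them as a comma-separated list:
nvidia-smi -L     # note the UUIDs of two GPUs that are not split into MIG instances
export CUDA_VISIBLE_DEVICES=GPU-0f000000-0000-0000-0000-000000000003,GPU-0f000000-0000-0000-0000-000000000004
./my_multi_gpu_test     # placeholder for your multi-GPU test application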