Choosing compute node types for my project

Brief information

Hints for the selection of the compute node type for your project

Detailed information

Hints for the selection of the compute node type on CLAIX (with Slurm)

For your free quota, you can use the CLAIX-18 nodes.

For your project, you can use the partitions your project is configured for. New projects are typically configured for CLAIX-18 nodes; older or prolonged projects are configured for the nodes on which they have already been running. You can also request additional partitions if needed; remember to justify your request. To check which partitions your project is configured for, see: Slurm Accounting
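As a generic Slurm sketch, the accounts and partitions associated with your user can often be listed with `sacctmgr`. Whether partition restrictions are visible this way depends on how the site has configured accounting, so the procedure described on the Slurm Accounting page takes precedence over this example.

```shell
#!/usr/bin/env bash
# Generic Slurm sketch; the site may provide its own tooling for this.
# Lists the accounts (projects) and partitions associated with the current user.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr show associations user="$USER" format=Account,Partition
else
    echo "sacctmgr not found; run this on a cluster login node" >&2
fi
```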

CLAIX-2018-MPI

For all compute projects, batch jobs are directed to the project's primary compute node type. For most compute projects, the primary compute node type is CLAIX-2018-MPI.

The characteristics of this node type are:

2 Intel Xeon Platinum 8160 Processors “SkyLake” (2.1 GHz, 24 cores each) and thus 48 cores per node,
192 GB main memory per node (~4 GB main memory per core)
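As an illustration of these figures, a job header that fills one such node might look like the following sketch. The job name and the `--mem-per-cpu` value are assumptions: the value is chosen slightly below 4 GB per core (192 GB / 48 cores) because some memory is typically reserved for the operating system, and the actual usable limit may differ.

```shell
#!/usr/bin/env bash
#SBATCH --job-name=full_node_example   # hypothetical job name
#SBATCH --nodes=1                      # one CLAIX-2018-MPI node
#SBATCH --ntasks-per-node=48           # 2 sockets x 24 cores
#SBATCH --mem-per-cpu=3900M            # assumption: a little under 192 GB / 48 cores

# Application commands would follow here.
```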

CLAIX-2018-GPU

The CLAIX-2018-GPU nodes form a group of nodes configured like the CLAIX-2018-MPI nodes and additionally equipped with 2 NVIDIA Tesla V100 GPUs each.
NVLink connects these 2 GPUs with each other. Each GPU provides 16 GB of HBM2 memory.

In rare constellations, due to a software bug, the older P100 GPU may be faster than the V100 GPU, but typically the newer GPU will be faster for your code.

The CLAIX-2018-GPU nodes are open for free/test quota usage and for all projects configured for the CLAIX-18 cluster.

Submitting batch jobs on the GPU cluster: GPU batch mode
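A minimal sketch of a GPU job header, using generic Slurm syntax, is shown below. The job name and runtime are placeholders, and the site may require a GPU type name in the `--gres` string; the GPU batch mode page is the authoritative reference for the exact form.

```shell
#!/usr/bin/env bash
#SBATCH --job-name=gpu_example   # hypothetical job name
#SBATCH --nodes=1
#SBATCH --gres=gpu:2             # generic Slurm syntax: request both GPUs of one node
#SBATCH --time=01:00:00          # placeholder runtime

# Application commands would follow here, e.g. a hypothetical ./my_gpu_app
```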

24 hrs versus 120 hrs max job runtime

It is desirable that compute jobs do not run "forever", mainly for the following reasons:

1. If there is some kind of system crash or a long-running job terminates abruptly, a lot of compute cycles will be wasted. Therefore, using software that can write checkpoint files every few hours and restart from those checkpoints is a very good idea. Restarting the job from the most recent checkpoint instead of from the beginning reduces the loss to a reasonable amount.

2. Every now and then, it is necessary that the system administrators schedule a downtime for maintenance and upgrades. Obviously, long running jobs are an obstacle.

3. Long-running jobs disturb the good mixture of jobs from many users, which is a prerequisite for everyone to get a fair share of the system resources and make decent progress. They also conflict with the scheduling of large parallel jobs and lead to long waiting times and poor overall system utilization.
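The checkpoint-and-restart idea from the first point can be sketched in a job script as follows. Everything named here is hypothetical: the `ckpt_*.dat` file naming with zero-padded step numbers, the application `./my_app`, and its `--restart` option. How an application actually writes and reads checkpoints depends on the software you use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest checkpoint file in a directory,
# or nothing if no checkpoint exists. The ckpt_*.dat naming is an assumption.
latest_checkpoint() {
    local dir="$1"
    # Zero-padded step numbers sort correctly in lexical order.
    ls -1 "$dir"/ckpt_*.dat 2>/dev/null | sort | tail -n 1
}

# Build the command line for a hypothetical application ./my_app
# that accepts "--restart <file>" to resume from a checkpoint.
build_command() {
    local ckpt
    ckpt="$(latest_checkpoint "$1")"
    if [ -n "$ckpt" ]; then
        echo "./my_app --restart $ckpt"
    else
        echo "./my_app"
    fi
}
```

In a job script, the command printed by `build_command "$PWD"` would be the one to launch, so that a resubmitted job continues from the most recent checkpoint instead of starting over.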

Therefore, the maximum runtime for jobs has been set to 24 hrs, as at many other HPC sites. Jobs that run on up to four nodes may run for up to 120 hrs (5 days).
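In a job script, the runtime limit is requested with the standard Slurm `--time` option, which accepts the format `days-hours:minutes:seconds`. The sketch below requests the 120-hour maximum for a job on four nodes; a larger job would request at most `--time=24:00:00`. The job name is a placeholder.

```shell
#!/usr/bin/env bash
#SBATCH --job-name=long_example   # hypothetical job name
#SBATCH --nodes=4                 # at most four nodes for runtimes above 24 h
#SBATCH --time=5-00:00:00         # 5 days = 120 hours

# Application commands would follow here.
```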

Note: For applications that do not support sufficient checkpointing, we might approve jobs running for more than 120 hours. Please explain explicitly in your project application why you have this requirement. In any case, exceptional approvals are subject to the following conditions:

Jobs running for more than 120 hours must not use more than 4 compute nodes.
The IT Center reserves the right to kill the job after 120 hours for maintenance reasons.

Job submission

The batch system is usually able to determine the correct kind of compute node from the job's resource requirements, within the set of allowed partitions. Thus, for GPU jobs the partition will be set automatically; see GPU batch mode.

Within your set of allowed partitions (see Slurm Accounting and Hardware of the RWTH High Performance Computing), you can set the partition manually with

#SBATCH -p c18s

Additional information

Overview of the available hardware: Overview

last changed on 10/20/2023

This work is licensed under a Creative Commons Attribution - Share Alike 3.0 Germany License