SLURM Accounting

Overview

  1. Accounting value
  2. MAXTRES, TRESBillingWeights and billing
  3. Processing of accounting data

Detailed information

1 - Accounting value

The accounting unit is called a "corehour", which is equivalent to one hour of wall-clock time per used core (or, more precisely, per billing value, which is explained below).

 

2 - MAXTRES, TRESBillingWeights and billing

We use a feature of SLURM called MAXTRES.
What are TRES? TRES are the so-called Trackable RESources, i.e. resources that are tracked and therefore also accounted.

With MAXTRES, it is possible to account for the usage of compute nodes fairly. SLURM uses a billing value for that. Let's start with a simple example:

We have one node with 10 cores, 100 GB of memory and 2 GPU cards. That means 10 GB of memory are worth 1 core (a tenth of the node) and 1 GPU is worth 5 cores (half of the node).

  • My job uses 1 core, 1 GB of memory and no GPU -> the billing value will be 1. So, if my job runs for one hour with a billing value of 1, I will be accounted 1 corehour.
  • My job uses 10 cores, 1 GB of memory and no GPU -> the billing value will be 10. So, if my job runs for 2.5 hours, I will be accounted 25 corehours.

So far this is business as usual, just as it was with LSF. Now some more examples:

  • My job uses 1 core, 1 GB of memory and 1 GPU -> the billing value will be 5. It uses just one core, but it occupies one of the two GPUs, so half of the node is blocked for other jobs.
  • My job uses 1 core, 90 GB of memory and no GPU -> the billing value will be 9, since 90% of the memory is used.
  • My job uses 7 cores, 80 GB of memory and 1 GPU -> the billing value will be 8: the memory share (80%) outweighs both the 7 cores and the 1 GPU (worth 5). The maximum TRES, in this case the memory, determines the billing value (see the sketch after this list).
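
To make the rule explicit: the billing value is the maximum over the individually weighted TRES. Here is a minimal Python sketch (not the actual SLURM implementation) that reproduces the examples above, using the hypothetical weights of the 10-core / 100 GB / 2-GPU example node:

    # Hypothetical weights for the example node:
    # 1 core = 1, 1 GB of memory = 0.1, 1 GPU = 5
    WEIGHTS = {"cpu": 1.0, "mem_gb": 0.1, "gpu": 5.0}

    def billing_value(cpus, mem_gb, gpus):
        # Billing is the maximum of the individually weighted TRES.
        return max(cpus * WEIGHTS["cpu"],
                   mem_gb * WEIGHTS["mem_gb"],
                   gpus * WEIGHTS["gpu"])

    print(billing_value(1, 1, 0))    # 1.0
    print(billing_value(10, 1, 0))   # 10.0
    print(billing_value(1, 1, 1))    # 5.0  (the GPU dominates)
    print(billing_value(1, 90, 0))   # 9.0  (the memory dominates)
    print(billing_value(7, 80, 1))   # 8.0  (the memory dominates)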

To achieve this, so-called TRESBillingWeights are associated with each partition, such that the aforementioned fair billing is done for the different node types.
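
For illustration, the weights for the example node above could be configured in slurm.conf roughly as follows. The partition name and values are hypothetical; the SLURM option behind the MAXTRES behaviour is the MAX_TRES priority flag, which takes the maximum instead of the sum of the weighted TRES, and the G suffix weights memory per GB:

    PriorityFlags=MAX_TRES
    PartitionName=example TRESBillingWeights="CPU=1.0,Mem=0.1G,GRES/gpu=5.0"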

 

3 - Processing of accounting data

Each night, the relevant data of all jobs that ran the day before is extracted from the SLURM database. The accounting data is truncated to that day. That means

  • a job that started and ended that day is accounted normally.
  • a job that started the day before and ended that day is accounted for the fraction that falls within that day.
  • a job that started that day and is still running is accounted for the fraction that falls within that day.
  • a job that started the day before and is still running is accounted for that whole day.

This prevents surprises like those known from LSF: a big, long-running job, e.g. 1000 cores for 120 hours, used to be accounted with 120k corehours all on the day it ended. The same job is now accounted with 24k corehours on each of the five days (see the sketch below).
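
As a rough sketch, the per-day truncation amounts to intersecting a job's runtime with the calendar day in question. The function and its arguments are illustrative, not the actual post-processing code:

    from datetime import datetime, timedelta

    def corehours_for_day(day, start, end, billing):
        # Accounted corehours of one job for one calendar day.
        # 'end' is None for a job that is still running.
        day_begin = datetime(day.year, day.month, day.day)
        day_end = day_begin + timedelta(days=1)
        effective_end = min(end, day_end) if end else day_end
        overlap = (effective_end - max(start, day_begin)).total_seconds()
        return max(overlap, 0) / 3600 * billing

    # The 1000-core job from above, on one of its five full days:
    start = datetime(2021, 1, 25)
    print(corehours_for_day(datetime(2021, 1, 26), start, None, 1000))  # 24000.0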

The extracted data is then post-processed so that r_wlm_usage has fast access to the accounting data.

 

Last modified on 29.01.2021


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License