SLURM Accounting

Overview

  1. Accounting value
  2. MAXTRES, TRESBillingWeights and billing
  3. Processing of accounting data

Detailed information

1 - Accounting value

The accounting unit is called a "corehour", which is equivalent to one hour of wall-clock time per used core (or, more precisely, per billing value, which is explained below).

 

2 - MAXTRES, TRESBillingWeights and billing

We use a feature of SLURM called MAXTRES.
What are TRES? TRES are the so-called Trackable RESources, i.e. resources that are tracked and therefore also accounted.

With MAXTRES, it is possible to account for the usage of compute nodes fairly. SLURM uses a billing value for that. Let's start with a simple example:

We have one node with 10 cores, 100 GB of memory and 2 GPU cards. That means 10 GB of memory are worth 1 core (a tenth of the node) and 1 GPU is worth 5 cores (half of the node).

  • My job uses 1 core, 1 GB of memory and no GPU -> the billing value will be 1. So, if my job runs for one hour with a billing value of 1, I will be accounted 1 corehour.
  • My job uses 10 cores, 1 GB of memory and no GPU -> the billing value will be 10. So, if my job runs for 2.5 hours, I will be accounted 25 corehours.

So far this is business as usual, just as it was with LSF. Now some more examples:

  • My job uses 1 core, 1 GB of memory and 1 GPU -> the billing value will be 5. It uses just one core, but it occupies one of the two GPUs, so half of the node is blocked for other jobs.
  • My job uses 1 core, 90 GB of memory and no GPU -> the billing value will be 9, since 90% of the memory is used.
  • My job uses 7 cores, 80 GB of memory and 1 GPU -> the billing value will be 8: the memory share (80%) outweighs both the 7 cores and the 1 GPU (worth 5). The maximum TRES, in this case the memory, determines the billing value (see the sketch after this list).
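
To make the rule explicit: the billing value is the maximum over the individually weighted TRES. Here is a minimal Python sketch (not the actual SLURM implementation) that reproduces the examples above, using the hypothetical weights of the 10-core / 100 GB / 2-GPU example node:

    # Hypothetical weights for the example node:
    # 1 core = 1, 1 GB of memory = 0.1, 1 GPU = 5
    WEIGHTS = {"cpu": 1.0, "mem_gb": 0.1, "gpu": 5.0}

    def billing_value(cpus, mem_gb, gpus):
        # Billing is the maximum of the individually weighted TRES.
        return max(cpus * WEIGHTS["cpu"],
                   mem_gb * WEIGHTS["mem_gb"],
                   gpus * WEIGHTS["gpu"])

    print(billing_value(1, 1, 0))    # 1.0
    print(billing_value(10, 1, 0))   # 10.0
    print(billing_value(1, 1, 1))    # 5.0  (the GPU dominates)
    print(billing_value(1, 90, 0))   # 9.0  (the memory dominates)
    print(billing_value(7, 80, 1))   # 8.0  (the memory dominates)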

To achieve this, so-called TRESBillingWeights are associated with each partition, such that the aforementioned fair billing is done for the different node types.
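
For illustration, the weights for the example node above could be configured in slurm.conf roughly as follows. The partition name and values are hypothetical; the SLURM option behind the MAXTRES behaviour is the MAX_TRES priority flag, which takes the maximum instead of the sum of the weighted TRES, and the G suffix weights memory per GB:

    PriorityFlags=MAX_TRES
    PartitionName=example TRESBillingWeights="CPU=1.0,Mem=0.1G,GRES/gpu=5.0"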

 

3 - Processing of accounting data

Each night, the relevant data of all jobs that ran the day before is extracted from the SLURM database. The accounting data is truncated to that day. That means

  • a job that started and ended that day is accounted normally.
  • a job that started the day before and ended that day is accounted for the fraction that falls within that day.
  • a job that started that day and is still running is accounted for the fraction that falls within that day.
  • a job that started the day before and is still running is accounted for that whole day.

This prevents surprises like those known from LSF: a big, long-running job, e.g. 1000 cores for 120 hours, used to be accounted with 120k corehours all on the day it ended. The same job is now accounted with 24k corehours on each of the five days (see the sketch below).
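
As a rough sketch, the per-day truncation amounts to intersecting a job's runtime with the calendar day in question. The function and its arguments are illustrative, not the actual post-processing code:

    from datetime import datetime, timedelta

    def corehours_for_day(day, start, end, billing):
        # Accounted corehours of one job for one calendar day.
        # 'end' is None for a job that is still running.
        day_begin = datetime(day.year, day.month, day.day)
        day_end = day_begin + timedelta(days=1)
        effective_end = min(end, day_end) if end else day_end
        overlap = (effective_end - max(start, day_begin)).total_seconds()
        return max(overlap, 0) / 3600 * billing

    # The 1000-core job from above, on one of its five full days:
    start = datetime(2021, 1, 25)
    print(corehours_for_day(datetime(2021, 1, 26), start, None, 1000))  # 24000.0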

The extracted data is then post-processed so that r_wlm_usage has fast access to the accounting data.

 

Last modified on 29.01.2021


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License