First Steps in Submitting a Job

This guide provides a short tutorial on how to submit computing jobs to the workload scheduler Slurm.

Creating Batch Scripts

To submit jobs, you need to create a batch script containing both your computing job parameters and the executable code. You can create and edit these scripts in two ways:

  • Graphically: Connect to the cluster using FastX, where you can utilize graphical editing tools.
  • Command-Line: SSH into the cluster and use a command line editor such as vim or nano.

The following section shows the batch script for a simple MPI job:

batch_script.sh
#!/usr/bin/zsh 

### Job Parameters 
#SBATCH --ntasks=8
#SBATCH --time=00:15:00
#SBATCH --job-name=example_job
#SBATCH --output=stdout.txt

### Program Code
srun hostname

The batch script consists of three parts:

  • The first line is always a shebang that defines the interpreter. The cluster only supports the zsh shell.
  • The next lines specify the job parameters (here lines 4 to 7). Each line starts with #SBATCH followed by the option you want to set. In this case, a job with 8 tasks and a run time of 15 minutes is requested. In addition, the job name is set to example_job, and both STDOUT and STDERR are written to stdout.txt.
  • Slurm stops reading job parameters at the first non-comment line (here line 10). Everything from that line onward is the program code that the job executes. In this job, each MPI rank prints its hostname.
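
The rule that job parameters are read only up to the first non-comment line can be tried out without Slurm. The following is a minimal sketch, not Slurm's actual parser: it writes the example script to a temporary file, appends one deliberately misplaced #SBATCH line after the srun command (an addition for illustration only), and uses awk to print the parameter lines that appear before the first command.

```shell
# Write the example batch script to a temporary file (illustrative path).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/zsh

### Job Parameters
#SBATCH --ntasks=8
#SBATCH --time=00:15:00
#SBATCH --job-name=example_job
#SBATCH --output=stdout.txt

### Program Code
srun hostname
#SBATCH --ntasks=16
EOF

# Approximate the reading rule: collect #SBATCH lines, skip other comments
# and blank lines, and stop at the first line that is a command.
parsed=$(awk '
  /^#SBATCH/  { print; next }   # job parameter line
  /^#/        { next }          # shebang or ordinary comment
  /^[ \t]*$/  { next }          # blank line
  { exit }                      # first command: stop reading
' "$script")

printf '%s\n' "$parsed"
rm -f "$script"
```

The four parameter lines from the example are printed, while the misplaced --ntasks=16 line after srun hostname is not, because reading has already stopped at that point.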

Submitting the Computing Job

After writing your batch script, log in via SSH to submit your job. Navigate to the directory where you created the batch script and use the following command to submit the job to the Slurm queue:

> sbatch batch_script.sh
Submitted batch job 12345678
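
If you submit from a script of your own, you may want to keep the job ID in a variable. The sketch below extracts it from the message shown above. The sample string stands in for real sbatch output so that the snippet also runs without Slurm, and the variable names are illustrative.

```shell
# Sample sbatch output copied from the example above. On the cluster this
# would instead be: submit_output=$(sbatch batch_script.sh)
submit_output='Submitted batch job 12345678'

# Remove everything up to and including the last space,
# which leaves only the numeric job ID.
job_id=${submit_output##* }
echo "$job_id"
```

This relies on the message format shown above. As an alternative, sbatch offers a --parsable option that prints the job ID without the surrounding text.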

After submitting your job script, Slurm returns a job ID, which you can use to manage your job. To check the current state of your job, use:

> squeue --me
             JOBID PARTITION        NAME      USER ST       TIME  NODES NODELIST(REASON)
          12345678     c23ms example_job  AB123456  R       0:02      1 n23m0023

Most importantly, ST shows the current state of your job: PD stands for pending and R for running. When your job no longer appears in the queue list, it has finished.
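
To read the state in a script, the ST column can be picked out of a squeue line. The following is a minimal sketch that uses the sample line from above instead of calling squeue, so it runs without Slurm. The function name describe_state is hypothetical, and only the two state codes mentioned above are translated.

```shell
# Translate the two state codes named above into words.
# Slurm has further codes; anything else is passed through as "other".
describe_state() {
  case "$1" in
    PD) echo "pending" ;;
    R)  echo "running" ;;
    *)  echo "other ($1)" ;;
  esac
}

# Sample squeue line from the example above; ST is the fifth column.
squeue_line='          12345678     c23ms example_job  AB123456  R       0:02      1 n23m0023'
state_code=$(printf '%s\n' "$squeue_line" | awk '{print $5}')

describe_state "$state_code"
```

For the sample line this prints running. Counting columns assumes the default squeue layout shown above and a job name without spaces.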

Note: The directory from which sbatch is called is set as the job's working directory.
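
As a consequence, relative paths such as stdout.txt are resolved against the submission directory. Slurm also exposes that directory to the job in the SLURM_SUBMIT_DIR environment variable. The sketch below reads it with a fallback to the current directory, which is an assumption added here so that the lines also run outside of a job.

```shell
# Inside a job, SLURM_SUBMIT_DIR holds the directory sbatch was called from.
# Outside of a job the variable is unset, so fall back to the current directory.
submit_dir=${SLURM_SUBMIT_DIR:-$PWD}
echo "Submission directory: $submit_dir"
```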


last changed on 05/14/2024

This work is licensed under a Creative Commons Attribution - Share Alike 3.0 Germany License