Best Practices

Brief information

  • Parameter studies and sensitivity analyses often require running many similar jobs that differ only in their input. Array jobs are recommended for this scenario.
  • Long computations should be divided into shorter steps, creating a chain job.
  • To avoid pitfalls, and to avoid discovering errors only after waiting hours for a job to start, the following procedure for setting up batch jobs has proven useful:
    • Avoid loading modules in your .zshrc or .bashrc.
    • Test on a frontend node whether your application starts as expected.
    • Put all commands necessary for the program to start into a script file.
      • Don't forget the module commands and the shebang (#!) in the first line of the script:

        #!/usr/local_rwth/bin/zsh

      • Use relative paths and ensure that you know the directory in which the job starts.
    • Make the script executable (chmod 775 myscript.sh), run it (./myscript.sh) and make sure that everything works as expected.
      • Put the memusage script in front of the executable name (e.g. r_memusage --vm who am i), run it for a minute and then kill the process (Ctrl+C). That way you can get an idea of how much memory the program uses.
      • Ensure that you have configured your program's memory / stack limits when necessary.
    • Put all job requests in the script file using the #SBATCH magic cookie.
      • Pay attention to limiting factors such as memory, number of slots, and job run time.
    • Submit your script with a run time of 5 minutes so that it has a greater chance of failing early.
    • If everything starts as expected, submit your final script.
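
The procedure above can be sketched as a small shell snippet that writes a 5-minute test job script and makes it executable. This is only an illustration under assumptions: the job name, the resource values, the executable my_program and the input file input.dat are hypothetical placeholders, and the modules to load depend on your application. The #SBATCH options used here (--job-name, --time, --ntasks, --mem-per-cpu, --output) are standard Slurm options.

```shell
# Write a hypothetical 5-minute test job script.
# All names and resource values below are placeholders, not recommendations.
cat > myscript.sh <<'EOF'
#!/usr/local_rwth/bin/zsh

# Hypothetical job name
#SBATCH --job-name=short_test
# Short run time so that errors show up early
#SBATCH --time=00:05:00
# Number of tasks (slots); placeholder value
#SBATCH --ntasks=1
# Memory per core; placeholder value, estimate it beforehand with r_memusage
#SBATCH --mem-per-cpu=1G
# Log file; %j is replaced by the job ID
#SBATCH --output=short_test.%j.log

# Show the directory in which the job starts, since relative paths depend on it.
echo "Job starts in: $PWD"

# Load the required modules here instead of in .zshrc or .bashrc.
# module load <module-name>   (placeholder, depends on your application)

# Start the program via a relative path (hypothetical executable and input).
./my_program input.dat
EOF

# Make the script executable, as described in the steps above.
chmod 775 myscript.sh

# List the job requests to double-check them before submitting.
grep '^#SBATCH' myscript.sh
```

Once the script runs as expected on a frontend node (./myscript.sh), it could be submitted with sbatch myscript.sh. After the short test succeeds, the --time value would be raised to the real run time for the final submission. For an array job, a directive such as #SBATCH --array=1-10 together with the $SLURM_ARRAY_TASK_ID variable is one way to select the input per task. For a chain job, sbatch --dependency=afterok:<jobid> is one way to start a step only after the previous one has finished successfully.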

Last modified on 29.01.2021
