Usage of Singularity

Summary

Singularity is a container virtualization software specifically designed for HPC environments. You may imagine containers as lightweight virtual operating systems with preinstalled and preconfigured software that can be run just like any other program on a host system. You could, for example, run software in an Ubuntu environment within our CentOS compute cluster. This helps overcome portability issues with software that has very specific dependencies or was not built to be run under RHEL-based distributions.

 

Please note:

We are currently testing Singularity with selected use cases. If you need to run software on the cluster that profits from containerization and are interested in using Singularity, please contact the IT-ServiceDesk (servicedesk@itc.rwth-aachen.de). Software that has already been containerized for Docker can often be ported to Singularity with virtually no effort.


If you want to run software via Singularity, please read the Best Practices first!


 Detailed Information

Container Environment

Containers allow software developers and users to package software and its dependencies in a virtual environment that can easily be ported to completely different systems. Not only does this eliminate the need for complex software installation, it also makes results obtained with the containerized software reproducible, because computations always run on the exact same software versions.


Conventional containerization software isolates applications from the host system. This poses problems in the context of HPC because users often run jobs across multiple nodes using special interconnects that need software support. Singularity, however, allows containerized multi-node MPI jobs and can leverage the Intel Omni-Path fabric. While you do not have access to the host operating system, within the container you may still do the following:

  • Access all your personal directories ($HOME, $WORK, $HPCWORK). Within most containers you may access these via the aforementioned variables as usual. You may thus comfortably share data between containerized and native applications.
  • Access other nodes through the network and run multi-node jobs.
  • Access GPUs

In the following paragraphs, container image refers to the files that are used to run a container whereas container refers to the executed image.

 

Run a container

There are three standard ways of running a Singularity container: the shell, exec and run subcommands.

  • The shell subcommand allows you to start an interactive shell within the container and is intended for use on frontends or within interactive batch jobs.
     
  • The exec subcommand allows users to run custom shell commands in a containerized shell. This is the default way of using Singularity within batch scripts and can be coupled with a separate exec script.
     
  • The run subcommand triggers the execution of a pre-defined runscript within the container. Container creators may choose to provide a default workflow which can be accessed this way.


Examples of starting a container

singularity shell $HOME/my_container
singularity run $HOME/my_container
singularity exec $HOME/my_container cat /etc/os-release
singularity exec $HOME/my_container $HOME/my_exec_script.sh

Please be aware that a shell script to be run inside a Singularity container needs a shebang referring to an interpreter that is available inside that container. For example, scripts starting with #!/usr/local_rwth/bin/zsh will not work because that path normally exists only on the host. Most of the time you will want to default to #!/bin/bash unless another shell is available inside the container.
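Because only directory-based sandbox containers are supported, the shebang requirement can be checked on the host before a job is submitted. The following is a minimal sketch under that assumption; the function names shebang_interpreter and interpreter_in_sandbox are hypothetical and not part of Singularity or the cluster tooling. Absolute symlinks inside the sandbox would be resolved against the host, so a positive result is only a hint.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Singularity or the cluster tooling.

# Print the interpreter path named in the shebang of the script given as $1.
# Returns 1 if the first line is not a shebang.
shebang_interpreter() {
    local first_line=""
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '#!'*) ;;
        *) return 1 ;;
    esac
    # Drop the "#!" prefix and any whitespace that follows it
    first_line="${first_line#'#!'}"
    first_line="${first_line#"${first_line%%[![:space:]]*}"}"
    # Keep only the first word, i.e. the interpreter path without arguments
    printf '%s\n' "${first_line%%[[:space:]]*}"
}

# Succeed if the interpreter of script $2 exists below sandbox directory $1.
# Returns 2 if the script has no shebang at all.
interpreter_in_sandbox() {
    local sandbox="${1%/}" script="$2" interp
    interp="$(shebang_interpreter "$script")" || return 2
    [ -e "${sandbox}${interp}" ] || [ -L "${sandbox}${interp}" ]
}
```

A call such as "interpreter_in_sandbox $HOME/my_container $HOME/my_exec_script.sh" returns a non-zero status if the script's interpreter cannot be found in the image directory.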

 

Available Singularity Containers

We provide several selected software modules for use with Singularity beneath the CONTAINERS tree. To load them, please follow this procedure:

module load CONTAINERS
module load tensorflow
singularity shell $R_CONTAINER

As usual, you may get an overview of all available modules via "module avail".

 

Limitations

  • Software installed on the host system cannot be used within a container (most notably, the module system)
     
  • Container images can only be executed if they reside within special paths. This means you cannot run custom containers on your own!
     
  • We only support directory-based Singularity containers ("sandbox directories") and no regular SIF container files.
     
  • Users cannot build containers on the cluster. Containers have to be built on separate machines or with remote builders, or downloaded pre-built.


 

 

Singularity and MPI

Singularity supports two models for MPI usage. The first, which we very strongly recommend, uses the host MPI implementation to handle the actual communication between processes and to utilize the Intel Omni-Path interconnect between compute nodes. The container merely has to contain binary-compatible MPI libraries, which often comes down to installing a similar or even the same MPI implementation in the container. This is called the "hybrid model". Notably, it allows you to use our regular intelmpi and openmpi modules with your container. The other option is the "bind model", which involves binding the host MPI implementation into the container. This requires additional effort and should not be used unless necessary.

Using the hybrid model is very easy if the containerized MPI version is supported. Instead of running the MPI wrapper ($MPIEXEC) inside the container, you run the container inside the wrapper, i.e. "$MPIEXEC singularity run my_container". For an example batch script, see the MPI example under "Example Batch Scripts".
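The call order of the hybrid model can be sketched as a small wrapper function. This is an illustration only: run_container_under_mpi is a hypothetical name, and the sketch assumes that $MPIEXEC is set by the loaded MPI module, so it refuses to start the container if the variable is empty.

```shell
#!/bin/bash
# Hypothetical wrapper, not part of Singularity or the cluster tooling.
# Assumption: MPIEXEC is set by the loaded MPI module.
run_container_under_mpi() {
    if [ -z "${MPIEXEC:-}" ]; then
        echo "MPIEXEC is not set; load an MPI module first" >&2
        return 1
    fi
    # MPIEXEC is left unquoted on purpose so that a wrapper with arguments
    # is split into separate words. The MPI wrapper starts the container,
    # not the other way round.
    $MPIEXEC singularity exec "$@"
}
```

Inside a batch job, "run_container_under_mpi /path/to/my/container $HOME/myexecscript.sh" would then expand to the same command line as in the MPI batch example.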


 

GPU Usage in Containers

Providing access to GPUs inside containers is a non-trivial task. Luckily, Singularity supports this with a simple command line argument. To use NVIDIA GPUs, simply add the "--nv" option after your desired subcommand. Let's assume we want to execute the GPU variant of the tensorflow module:

module load CONTAINERS
# change version if needed
module load tensorflow/2.4.1-gpu
singularity exec --nv $R_CONTAINER $HOME/my_tensorflow_script.sh


Naturally the --nv flag will only work correctly on systems that actually have a GPU installed. If run on a non-GPU host, Singularity will issue a warning but still execute the container.

Singularity will use the host's CUDA installation where possible. This works well for a lot of applications that support a recent CUDA version.
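If the same script should run on GPU and non-GPU hosts without triggering the warning mentioned above, the "--nv" flag can be added conditionally. The following sketch assumes that a working "nvidia-smi -L" on the host indicates a usable GPU; nv_flag and the NVIDIA_SMI override variable are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print "--nv" only if an NVIDIA GPU appears to be usable.
# NVIDIA_SMI is an illustrative override for the probe command (default: nvidia-smi).
nv_flag() {
    local probe="${NVIDIA_SMI:-nvidia-smi}"
    # "nvidia-smi -L" lists the GPUs of the host and fails if none can be queried
    if command -v "$probe" >/dev/null 2>&1 && "$probe" -L >/dev/null 2>&1; then
        printf '%s\n' '--nv'
    fi
}
```

The result can be spliced into the call unquoted, e.g. "singularity exec $(nv_flag) $R_CONTAINER $HOME/my_tensorflow_script.sh", so that nothing is passed on hosts without a GPU.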

 
 

Example Batch Scripts
 

Serial Example

#!/usr/local_rwth/bin/zsh

### Job name
#SBATCH --job-name=SINGULARITY_EXAMPLE

### File / path where STDOUT will be written, %J is the job id
#SBATCH --output=singularity-job-out.%J

### Request the time you need for execution. The full format is D-HH:MM:SS
### You must at least specify minutes or days and hours and may add or
### leave out any other parameters
#SBATCH --time=30

### Request memory you need for your job in MB
#SBATCH --mem-per-cpu=2000

### Request number of hosts
#SBATCH --nodes=1

### Request number of CPUs
#SBATCH --cpus-per-task=4

### Change to the work directory
cd $HOME/jobdirectory

### Execute the container
### myexecscript.sh contains all the commands that should be run inside the container

singularity exec /path/to/my/container $HOME/myexecscript.sh

 

MPI Example

#!/usr/local_rwth/bin/zsh

### Job name
#SBATCH --job-name=SINGULARITY_MPI_EXAMPLE

### File / path where STDOUT will be written, %J is the job id
#SBATCH --output=singularity-job-out.%J

### Request the time you need for execution. The full format is D-HH:MM:SS
### You must at least specify minutes or days and hours and may add or
### leave out any other parameters
#SBATCH --time=30

### Request memory you need for your job in MB
#SBATCH --mem-per-cpu=2000

### Request number of tasks/MPI ranks
#SBATCH --ntasks=4

### Change to the work directory
cd $HOME/jobdirectory

### Execute the container
### myexecscript.sh contains all the commands that should be run inside the container

$MPIEXEC singularity exec /path/to/my/container $HOME/myexecscript.sh


Special problems with shared $HOME directories

Not only do you have access to your home directory within the container, but it will also, by default, serve as the container's home directory for every container that you execute. This means that configuration files stored within your home directory, such as shell configuration files (for example for zsh), will be used within the container as well. This can prove both advantageous and disadvantageous: a shared configuration may make working within the container more comfortable, but it may also introduce settings that are incompatible with the containerized environment.
Shell-based compatibility issues are mitigated by Singularity's default behavior of invoking containers with /bin/sh. You may invoke another shell by specifying its path via the "--shell" argument. The shell needs to exist within the container image which is usually the case for bash but not for zsh. 
Python software within containers should make use of virtual environments or package managers like conda to avoid hard-to-trace side effects.
 

If you wish to use an empty home directory within a container instead, please add the "--no-home" flag to your container invocation. This requires you to start the container from a path that is not within your home directory. You can also use a different directory as your temporary home directory via "--home /path/on/host".
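A separate home directory per container keeps configuration files of different images apart. The sketch below only prepares such a directory for use with the "--home" option described above; container_home is a hypothetical helper, and the base directory (for example $WORK/container_homes) is an assumption, not a cluster convention.

```shell
#!/bin/bash
# Hypothetical helper: create a dedicated home directory for one container
# below a base directory and print its path.
container_home() {
    local base="$1" name="$2" dir
    dir="${base%/}/${name}"
    mkdir -p "$dir" || return 1
    printf '%s\n' "$dir"
}
```

The printed path can be passed on directly, e.g. singularity shell --home "$(container_home "$WORK/container_homes" my_container)" $HOME/my_container.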

 

Converting Docker Images for Singularity

Please note: 

The following procedure can be used to download a Docker image and turn it into a directory-based Singularity container. In order to execute the container, you will have to contact servicedesk@itc.rwth-aachen.de.

 

Pull Docker Image From External Resource

Singularity can pull arbitrary Docker images and convert them to Singularity containers in a single step. Container registries or software documentation will often explain how to retrieve an image like so:
 

docker pull nvcr.io/nvidia/tensorflow:20.12-tf2-py3


This tells Docker to pull the image "tensorflow" in version "20.12-tf2-py3" from the NVIDIA container registry. You can create a sandbox container from a Docker resource without any special privileges like this:
 

singularity build --sandbox tensorflow_20.12-tf2-py3 docker://nvcr.io/nvidia/tensorflow:20.12-tf2-py3


The prefix "docker://" tells singularity that the following URI points to a docker image and should be treated as such.
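The sandbox directory in the build command above is named after the image and its tag. A small helper can derive that name from the URI so that both stay in sync. This is a sketch only: sandbox_name is a hypothetical function, the "<image>_<tag>" scheme merely mirrors the example above, and references by digest (@sha256:...) are not handled.

```shell
#!/bin/bash
# Hypothetical helper: derive a sandbox directory name "<image>_<tag>"
# from a Docker URI such as docker://nvcr.io/nvidia/tensorflow:20.12-tf2-py3.
sandbox_name() {
    local ref="${1#docker://}"   # drop the URI scheme
    ref="${ref##*/}"             # keep only the last path component (image:tag)
    case "$ref" in
        *:*) printf '%s_%s\n' "${ref%%:*}" "${ref#*:}" ;;
        *)   printf '%s_latest\n' "$ref" ;;   # Docker assumes "latest" if no tag is given
    esac
}
```

With uri=docker://nvcr.io/nvidia/tensorflow:20.12-tf2-py3, the build step becomes singularity build --sandbox "$(sandbox_name "$uri")" "$uri".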

 

Pull Image From Nvidia Container Registry

This snippet shows the full process from pulling to executing an image from the NVCR.

# Pull Tensorflow 20.12
singularity build --sandbox tensorflow_20.12-tf2-py3 docker://nvcr.io/nvidia/tensorflow:20.12-tf2-py3

# Inspect a container with an interactive shell and GPU support
singularity shell --nv tensorflow_20.12-tf2-py3

# Execute a predefined script in a container with GPU support, e.g. within SLURM
singularity exec --nv tensorflow_20.12-tf2-py3 ./myexecscript.sh

 

Build Singularity Images on top of Docker Images

Please note: 

You need elevated privileges, i.e. the ability to run Singularity as root, to build containers. Therefore, users cannot build Singularity recipes on the cluster but have to resort to other machines or remote builders.

Singularity supports building containers on top of docker images via the "docker" bootstrapping option. A stub for this purpose would look like this:

Bootstrap: docker
From: ubuntu:18.04

# Use the image "ubuntu:18.04" from the Docker registry as the foundation for this container

%post
    export DEBIAN_FRONTEND=noninteractive
    apt-get update
    apt-get install -y extra_packages
    rm -rf /var/lib/apt/lists/*

%help
    An example of how to bootstrap a singularity container

 
 

 Best Practices for Singularity Usage

By default, Singularity carries over the host process's environment and uses the user's host home directory. This can lead to problems that are hard to debug for unwary users. Please make sure that you implement these best practices where applicable to your use case.

  • Handling of Modules
    • Most modules are not supported by Singularity containers, with MPI modules as a notable exception. Loading modules changes a shell's environment, and these changes are carried over to a Singularity container invoked within this shell. This does not necessarily break things within the container as long as the environment variables changed by the module are not used by the containerized software. In some instances, such as compiler modules, these changes may cause software to break, e.g. when the C compiler variable CC is set to "icc" (the Intel compiler binary), which does not exist in the container. To avoid such issues we recommend unloading all modules that are not needed for the container before starting Singularity. If your program does not rely on MPI, you may use "module purge".
       
  • Using Compatible MPI Versions
    • Containers run via MPI need to be provided with a compatible MPI implementation. This can usually be achieved by choosing a compatible version from our module tree, loading the module and starting Singularity via the proper MPI wrapper (see above for an MPI batch example). If the container has been provided by a third party it should contain information on the MPI version against which the program was linked. It should be noted that for OpenMPI versions below 3.0.0 compatibility is only guaranteed for versions matching exactly.
       
  • Running a Container Without $WORK or $HPCWORK
    • By default both $WORK and $HPCWORK are mapped into each container run on the cluster giving you access to all personal directories you would have access to in a native environment. In case of filesystem problems this prevents the startup of containers which may be undesired if either of the filesystems is not used by a job. To circumvent this, consider the following line:
      singularity shell --contain --bind=/dev,$HOME,$WORK my_image
      The --contain flag disables binding of many host paths and the subsequent bind argument rebinds all relevant directories. In this case we have excluded $HPCWORK. If you want to start a container without $WORK, you can simply exchange the $WORK bind with $HPCWORK.
       
  • Running Python Software in a Container
    • In general, container images provide all necessary Python modules in a default system path, which normally takes precedence over any locally installed modules. In this case, potentially conflicting module installations within your $HOME directory will not cause any problems. However, some images, mainly those distributed for Docker, might store modules in a custom location that needs to be added to the PYTHONPATH environment variable in order to use the image as intended. A properly configured Singularity image will take care of such issues by setting PYTHONPATH accordingly. If this is not the case, you should make sure to start the container with an empty PYTHONPATH variable, e.g. by executing "unset PYTHONPATH" beforehand.

      You should abstain from installing software to default locations while inside a container, as this can easily break existing or upcoming software installations. pip should only be used if you are fully aware of the consequences; if you see the need to use it, you are probably doing something wrong. Likewise, sharing venv or conda environments between containers and the host system is almost guaranteed to lead to problems.
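The bind selection described under "Running a Container Without $WORK or $HPCWORK" can be generated instead of typed by hand. The following sketch builds the value for "--bind" from /dev plus every argument that is an existing directory; bind_list is a hypothetical helper. Note that the directory test only filters out unset variables and missing paths; it cannot detect a filesystem that is mounted but unresponsive.

```shell
#!/bin/bash
# Hypothetical helper: print a comma-separated bind list consisting of /dev
# and every argument that is an existing directory. Empty or missing
# arguments are skipped.
bind_list() {
    local list="/dev" dir
    for dir in "$@"; do
        if [ -n "$dir" ] && [ -d "$dir" ]; then
            list="${list},${dir}"
        fi
    done
    printf '%s\n' "$list"
}
```

A call like singularity shell --contain --bind="$(bind_list "$HOME" "$WORK")" my_image then corresponds to the example above, with $HPCWORK left out.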
 

Common Errors

  • "exec: ...: a shared library is likely missing in the image"
    • This error can be caused by numerous issues:
      • You are trying to execute a script that uses an invalid shebang (any scripting language). Please make sure that the path in your shebang, e.g. #!/bin/bash, is indeed available in your container. 
      • You are trying to execute a Python script that relies on modules which have not been installed in your container. In this case please see "Running Python Software in a Container" above.
 

Further Questions

If you have any questions that were not (fully) answered above or have any suggestions for improvements, please contact us via servicedesk@itc.rwth-aachen.de. If your questions regard Singularity itself, you may find the official User Guide helpful.

Last modified on 24.08.2021

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.