
File Systems Overview

Introduction

Each HPC user, along with their compute project, is allocated storage space on the HPC cluster to facilitate their research and computational tasks. The cluster offers various file systems, each with its unique advantages and limitations. This page provides an overview of these storage options.

Key Points:

  • The appropriate file system for any workflow depends on the task at hand. Understanding the differences between the file systems described on this page helps with this evaluation; the decision tree below can also guide the choice.
  • Users are primarily responsible for backing up their data. The HPC cluster is not intended as a long-term data storage solution. Back up your data regularly to avoid potential loss.
  • Additional storage space for compute time projects can be requested by following the guidelines provided here.
  • Instructions on how to check the available quota are detailed here.

The three permanent file systems are: $HOME, $WORK, and $HPCWORK. To navigate to your personal storage space, simply use the cd command followed by the file system's name. For example, cd $WORK will take you to your storage space on $WORK. For a specific compute time project, use cd /home/<project-id> to access the project's space in the $HOME partition, cd /work/<project-id> for the $WORK partition, or cd /hpcwork/<project-id> for the $HPCWORK partition.
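For example, on a login node (the project ID is a placeholder):

cd $HOME                    ### personal home directory
cd $WORK                    ### personal work directory
cd /hpcwork/<project-id>    ### project space on $HPCWORK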

Overview

$HOME

Upon logging into the HPC Cluster, users are directed to their personal home directory within $HOME. As a Network File System (NFS) with an integrated backup solution, this file system is particularly well-suited for storing important results that are difficult to reproduce, as well as for developing code projects. However, $HOME is less suitable for running large-scale compute jobs due to its limited space of 150 GB. In addition, frequent and massive file transfers, creation, and deletion can put significant strain on the backup system.

$WORK

$WORK shares some similarities with $HOME, as it is also operated on an NFS. However, the key difference is that $WORK has no backup solution. This absence of backups allows for a more generous storage quota of 250 GB, and there is greater flexibility in expanding storage for compute projects if needed.

$WORK is particularly suitable for compute jobs that are not heavily dependent on I/O performance and that generate numerous small files.

$HPCWORK

The $HPCWORK file system is based on Lustre, which allows for larger storage space and improved I/O performance compared to $HOME and $WORK. Each user and compute project is granted a default quota of 1000 GB on this file system. In addition, the system can handle extremely large files and fast parallel access to them.

These benefits are possible in part because the metadata of files is stored in metadata (MD) databases and handled by specialized servers. However, each file, regardless of its size, occupies a similar amount of space in the MD database. To maintain a manageable number of MD database entries for each user and compute project, there is also a quota on the number of files on $HPCWORK. The default file quota is set to 50,000 files.

Mind that $HPCWORK also provides no backup solution.
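If you want to check your current usage against these quotas, a minimal sketch using the standard Lustre client tools looks as follows (the cluster's dedicated quota command, linked above, may present this information differently):

### Show used/available space and number of files (inodes) on $HPCWORK
lfs quota -u $USER /hpcwork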

$BEEOND

For compute jobs with high I/O performance demands, users can leverage the internal SSDs of the compute nodes. The BeeOND (BeeGFS on Demand) temporary file system enables users to utilize the SSD storage across all requested nodes as a single, parallel file system within a single namespace.

Key Considerations when using BeeOND:

  • The amount of allocated storage depends on the type and number of requested nodes (for more information click here).
  • Compute jobs that request BeeOND become exclusive jobs (no other jobs will share the allocated nodes).
  • Within the job, the file system path for BeeOND is accessible via the environment variable $BEEOND.
  • The storage space on this file system is strictly temporary! All files will be automatically deleted after the compute job concludes.

This example job script shows how to use BeeOND:

#!/usr/bin/zsh

### Request BeeOND
#SBATCH --beeond
### Specify other Slurm commands

### Copy files to BeeOND
cp -r $WORK/yourfiles $BEEOND

### Navigate to BeeOND
cd $BEEOND/yourfiles

### Perform your job
echo "hello world" > result

### Afterwards copy results back to your partition
cp -r $BEEOND/yourfiles/result $WORK/yourfiles/
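
The script can then be submitted like any other batch job, for example with sbatch jobscript.sh (the file name is just an example).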

Decision Tree

If you are unsure which file system to use for your compute jobs, the following decision tree may help:

Summary

The following table summarizes the file systems discussed above and lists some additional details.

File System | Type | Path | Persistence | Snapshots | Backup | Quota (space) | Quota (#files) | Use Cases
$HOME | NFS/CIFS | /home/<username> | permanent | $HOME_SNAPSHOT | yes | 150 GB | - | source code, configuration files, important results
$WORK | NFS/CIFS | /work/<username> | permanent | $WORK_SNAPSHOT | no | 250 GB | - | many small working files
$HPCWORK | Lustre | /hpcwork/<username> | permanent | - | no | 1000 GB | 50,000 | I/O-intensive compute jobs, large files
$BEEOND | BeeOND | stored in $BEEOND | temporary | - | no | limited by the sum of sizes of local disks | - | I/O-intensive compute jobs, many small working files, any kind of scratch data

Additional Information

Visibility of Filesystem

Each user directory is only mounted when a process actually accesses it. Therefore, you might not see a specific user directory in a listing of /home, /work or /hpcwork (which can be confusing, especially if you are using a graphical file manager). This does not mean that a user directory does not exist, but that you might have to type the path to a user directory explicitly to actually get there. Any action that causes an access to your directory will mount it and thereby make it visible. This can be done by changing into the directory in a terminal or typing the full path of your directory in the address bar of your file manager.
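A minimal terminal sketch of this behavior (the project ID is a placeholder):

ls /work                    ### the target directory may not appear here yet
cd /work/<project-id>       ### accessing the path triggers the mount
ls /work                    ### the directory is now visible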

Backup Exclusion

The following files and directories are currently excluded from the backup in $HOME:

  • Complete sub-directories: 
    • .NOBACKUP
    • ~/.cache
    • ~/.comsol/*/configuration
    • ~/.Trash*
    • ~/.local/share/Trash*
  • File patterns:
    • core.*.rz.RWTH-Aachen.DE.[1-9]*.[1-9]*
    • core.*.hpc.itc.rwth-aachen.de.[1-9]*.[1-9]*

Snapshots

Snapshots reflect the state of a file system at previous points in time. By changing to a snapshot directory ($HOME_SNAPSHOT or $WORK_SNAPSHOT), you can access previous versions of your files. The files within the snapshots are read-only; they cannot be altered or deleted. Please note that the snapshot creation policy is subject to change. If space gets short, we may decide to create fewer snapshots, delete existing snapshots, or omit them completely. Snapshots are not an alternative to a backup: if a file system gets damaged, all snapshots are lost, too.
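For example, to restore an older version of a file from a $HOME snapshot (the snapshot name and file paths are placeholders; the actual snapshot layout may differ):

ls $HOME_SNAPSHOT                                            ### list available snapshots
cp $HOME_SNAPSHOT/<snapshot>/myproject/input.dat $HOME/myproject/   ### copy an older version back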

$TMP

In case you are using a single node exclusively and would like to use the local SSD, BeeOND can cause some unnecessary overhead. In this case you can directly access the SSD via the path stored in $TMP.

We do not recommend using the local SSD storage on shared nodes, as there are no restrictions on how much space each user can occupy. It is therefore unpredictable whether storage space on the local SSD is available.
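A minimal job script sketch using $TMP on an exclusively requested node (the Slurm options and file names are illustrative and need to be adapted to your job):

#!/usr/bin/zsh

### Request a single node exclusively
#SBATCH --nodes=1
#SBATCH --exclusive
### Specify other Slurm commands

### Copy input files to the local SSD
cp -r $WORK/yourfiles $TMP
cd $TMP/yourfiles

### Perform your job
echo "hello world" > result

### Copy results back, since the local SSD is not reachable from the login nodes
cp result $WORK/yourfiles/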

last changed on 02/15/2024


This work is licensed under a Creative Commons Attribution - Share Alike 3.0 Germany License