
Data Transfer with rclone


You can use the software rclone to copy or synchronize your data between CLAIX and other HPC systems.


Rclone Configuration

On CLAIX, rclone is available on the copy23-1 and copy23-2 systems. If rclone is not available on the HPC system you want to transfer data to or from, it can be installed as described in the rclone documentation.

Configuration example:

rclone config
 
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q>n
Enter name for new remote.
name> rclonetargetname
 
Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
 1 / 1Fichier
   \ (fichier)
 2 / Akamai NetStorage
   \ (netstorage)
...
50 / SSH/SFTP
   \ (sftp)
...
Storage>50
 
Option host.
SSH host to connect to.
E.g. "example.com".
Enter a value.
host>hpcstorage.example.de
 
Option user.
SSH username.
Enter a value of type string. Press Enter for the default (johndoe).
user>johndoe
 
Option port.
SSH port number.
Enter a signed integer. Press Enter for the default (22).
port>Press Enter
...
SSH password, leave blank to use ssh-agent.
n) No, leave this optional password blank (default)
y/g/n>Press Enter
 
Option key_pem
key_pem>Press Enter
 
Option key_file
key_file>/home/johndoe/.ssh/id_ed25519
 
Option key_file_pass.
...
n) No, leave this optional password blank (default)
 
y/g/n>Press Enter
 
Option pubkey.
pubkey>Press Enter
 
Option pubkey_file.
pubkey_file>Press Enter
 
Option key_use_agent.
key_use_agent>true
 
Option use_insecure_cipher.
use_insecure_cipher>Press Enter
 
Option disable_hashcheck.
disable_hashcheck>Press Enter
 
Option ssh
ssh>Press Enter
 
Edit advanced config?
y/n>Press Enter
 
Keep this "rclonetargetname" remote?
y/e/d>Press Enter
 
Current remotes:
rclonetargetname     sftp
...
q) Quit config
e/n/d/r/c/s/q>q

As a result, you get an rclone.conf file that contains the following information:

[rclonetargetname]
type = sftp
host = hpcstorage.example.de
user = johndoe
key_file = /home/johndoe/.ssh/id_ed25519
key_use_agent = true

For the RWTH cluster in particular:

[targetcluster]
type = sftp
host = copy23-2.hpc.itc.rwth-aachen.de
user = maxmustermann8
key_file = /home/maxmustermann8/.ssh/id_ed25519
key_use_agent = true

The configuration file is saved under the following path, which you can display with the command rclone config file:

rclone config file
/home/johndoe/.config/rclone/rclone.conf
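
To confirm which remotes ended up in the file, rclone provides the command rclone listremotes. As a complement, the following minimal sketch reads the section headers directly from the file. The function name list_conf_remotes is hypothetical, and the sketch assumes an unencrypted configuration file with INI-style headers such as [rclonetargetname].

```shell
#!/bin/sh
# Print the remote names defined in an rclone config file, one per line.
# Assumption: the file is unencrypted and uses INI-style "[name]" headers.
# list_conf_remotes is a hypothetical helper, not part of rclone.
list_conf_remotes() {
    conf_path="${1:-$HOME/.config/rclone/rclone.conf}"
    # Keep only lines of the form "[name]" and strip the brackets.
    sed -n 's/^\[\(.*\)]$/\1/p' "$conf_path"
}

# Usage:
#   list_conf_remotes "$HOME/.config/rclone/rclone.conf"
```

If the configuration is encrypted, the file no longer contains readable headers and this sketch prints nothing; in that case use rclone listremotes.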

Preparing for file transfer

Before you start:

  • Check if you have an ssh-key
    • The transfers should be carried out with an ssh-key. The ssh-key must be protected by a secure passphrase.
  • Check quotas on both clusters
    • with r_quota for CLAIX
  • Change to a suitable file system
    • e.g. cd $HPCWORK
  • Transferring a few large files is significantly faster than transferring many small ones. Therefore, please consolidate smaller files into larger archives whenever possible (e.g., using tar -cf tar_archive.tar data_to_transfer).
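
The consolidation step can be tried out safely as follows. This is a minimal sketch that works only in a temporary scratch directory; the placeholder files stand in for real data, and the names tar_archive.tar and data_to_transfer are taken from the example above.

```shell
#!/bin/sh
# Bundle a directory of small files into one archive and check its contents.
workdir=$(mktemp -d)                 # scratch directory for this demo only
mkdir "$workdir/data_to_transfer"

# Create a few small placeholder files (stand-ins for real data).
for i in 1 2 3; do
    printf 'sample %s\n' "$i" > "$workdir/data_to_transfer/file_$i.txt"
done

# -C changes into workdir first, so the archive stores relative paths.
tar -cf "$workdir/tar_archive.tar" -C "$workdir" data_to_transfer

# List the archive to confirm that every file was included.
tar -tf "$workdir/tar_archive.tar"
```

Listing the archive with tar -tf before the transfer is a cheap way to catch a wrong source path before any data leaves the cluster.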

Copying data from another cluster to CLAIX

The following example shows how you can copy data from another cluster to CLAIX.

1. Connect to the respective node:

ssh copy23-1.hpc.itc.rwth-aachen.de

or

ssh copy23-2.hpc.itc.rwth-aachen.de

2. Check if an ssh-key is available in the ssh agent:

ssh-add -l

You can add the ssh-key using the following command or by activating agent forwarding:

ssh-add /home/maxmustermann8/.ssh/johndoe_id_ed25519

3. List your data:

rclone lsd rclonetargetname:

4. Change to the target directory (e.g. cd $HPCWORK/dir) and copy the data from rclonetargetname into it:

rclone copy --multi-thread-streams=4 --ignore-checksum rclonetargetname:/path/ ./

Please note:
In this case, the number of threads (--multi-thread-streams) should not be increased further. With rclone's default of --transfers 4, up to 16 streams (= multi-thread-streams * transfers) are already active simultaneously, depending on the number of files to be transferred.
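
The arithmetic behind this note can be sketched as a small helper. The function name max_streams is hypothetical, and the value 4 for --transfers is rclone's documented default, which applies because the command above does not set it.

```shell
#!/bin/sh
# Upper bound on simultaneous streams: multi-thread-streams * transfers.
# max_streams is a hypothetical helper, not an rclone command.
max_streams() {
    # $1 = value of --multi-thread-streams, $2 = value of --transfers
    echo $(( $1 * $2 ))
}

max_streams 4 4   # prints 16
```

The result is an upper bound: it is only reached when enough files are in flight at the same time.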

Copying data to another cluster from CLAIX

The following example shows how you can copy data from CLAIX to another cluster.

1. Connect to the respective node:

ssh copy23-1.hpc.itc.rwth-aachen.de

or

ssh copy23-2.hpc.itc.rwth-aachen.de

2. Check if an ssh-key is available in the ssh agent:

ssh-add -l

You can add the ssh-key using the following command or by activating agent forwarding:

ssh-add /home/maxmustermann8/.ssh/johndoe_id_ed25519

3. List your data:

rclone lsd rclonetargetname:

4. Copy the data to the other cluster (e.g. after changing to the source directory with cd $HPCWORK/dir):

rclone copy --transfers 16 --ignore-checksum ./ rclonetargetname:/path/

Be cautious with the parameters --multi-thread-streams and --transfers.
Too many simultaneous transfers can overload the filesystems. Before starting a long transfer, please test the behavior of the source and destination filesystems by transferring a smaller amount of data.
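
One way to set up such a test transfer is to restrict rclone to a small file list using its --files-from flag. The following is a minimal sketch; the function name make_sample_list, the sample size, and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Write the relative paths of the first N regular files below a directory
# (in sorted order) to a list file that rclone can read via --files-from.
# make_sample_list is a hypothetical helper, not part of rclone.
make_sample_list() {
    src_dir=$1      # directory that will be the rclone source
    count=$2        # number of files to include in the test transfer
    list_file=$3    # output file, one relative path per line
    ( cd "$src_dir" && find . -type f | sed 's|^\./||' | sort | head -n "$count" ) > "$list_file"
}

# Usage (hypothetical paths):
#   make_sample_list "$HPCWORK/dir" 20 "$HOME/sample_files.txt"
#   rclone copy --files-from "$HOME/sample_files.txt" "$HPCWORK/dir" rclonetargetname:/path/test/
```

The paths in the list are relative to the source directory, which is what --files-from expects.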

Configuration without SSH key for target systems that allow password login

If the remote (non-RWTH) HPC system allows login via password (i.e., no 2FA), the password can be stored in the rclone configuration, which should then be encrypted (configuration encryption). An example command for the configuration:

rclone config create rclonetargetname sftp host=hpcstorage.example.de user=johndoe pass=yourpassword --obscure
# in this case key_use_agent = false should be defined in the configuration
# --obscure alone is not safe; please use https://rclone.org/docs/#configuration-encryption

Data synchronization between two HPC systems

You can synchronize your data between two HPC systems, e.g. between CLAIX and another cluster, as follows.

Listing data:

rclone lsd rclonetargetname:

Synchronizing data:

rclone sync --ignore-checksum ./ rclonetargetname:/path/
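
Note that rclone sync makes the destination match the source, which includes deleting files at the destination that do not exist in the source. It can therefore help to preview the operation with rclone's --dry-run flag first. The following is a minimal sketch; the function name preview_sync and the RCLONE variable are hypothetical conveniences, not part of rclone.

```shell
#!/bin/sh
# Preview a sync without changing anything at the destination.
# RCLONE can be overridden to point at a specific rclone binary.
RCLONE="${RCLONE:-rclone}"

# preview_sync is a hypothetical wrapper around "rclone sync --dry-run".
preview_sync() {
    # $1 = source, $2 = destination (e.g. rclonetargetname:/path/)
    "$RCLONE" sync --dry-run --ignore-checksum "$1" "$2"
}

# Usage:
#   preview_sync ./ rclonetargetname:/path/
```

Once the preview lists only the expected copies and deletions, the same command can be run without --dry-run.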

Last modified on 08.10.2025


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.