
Jupyter Notebook


Note: This method is deprecated! It is no longer supported by us and remains here only for backwards compatibility of the documentation!

Please use the HPC JupyterHub instead! 


For general remarks on Python, see this page; for TensorFlow as a representative of Python-based AI/DL codes, see this page.

Jupyter Notebook is a web-based application for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results.

A Jupyter notebook combines two components:

  • A web application: a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output.
  • Notebook documents: a representation of all content visible in the web application, including inputs and outputs of the computations, explanatory text, mathematics, images, and rich media representations of objects.


1. Installation

The easiest way to install the Jupyter Notebook software in your $HOME directory, in order to use it on the computers in the HPC Cluster, is via 'pip3'.

Jupyter Template

# Load the latest python
$ module switch intel gcc
$ module load python/3.8.7

# Install jupyter as local user
$ pip3 install --user  jupyter

# Do some tests!
$ python3 -c "import jupyter; print ('Jupyter    ', jupyter.__version__)"
$ pip3 -vvv list  | grep jupy

example output:

pk224850@login18-g-2:~[516]$ python3 -c "import jupyter; print ('Jupyter    ', jupyter.__version__)"
Jupyter     1.0.0
pk224850@login18-g-2:~[517]$ pip3 -vvv list  | grep jupy                                             
jupyter                1.0.0     /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip
jupyter-client         6.1.11    /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip
jupyter-console        6.2.0     /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip
jupyter-core           4.7.1     /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip
jupyterlab-pygments    0.1.2     /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip
jupyterlab-widgets     1.0.0     /rwthfs/rz/cluster/home/pk224850/.local/lib/python3.8/site-packages       pip

Depending on the age of your Python installation (and of your installed Python packages), you may need one or more of the flags "--upgrade --force-reinstall --no-cache-dir". Remember that these can break your existing software chains, so do not start without a backup!
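For example, a forced, cache-free reinstall into your user site-packages could look like the sketch below; whether you actually need all three flags depends on your installation:

```shell
# Reinstall Jupyter into ~/.local, ignoring the pip cache and any version
# already present; note this may also replace dependencies of other
# user-installed packages.
pip3 install --user --upgrade --force-reinstall --no-cache-dir jupyter
```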

2. Starting the Notebook Server in a batch job

Random Ports
The example job scripts on this page use random ports. Please monitor your job logs and restart the job, or change the random port range, in case of a port collision.
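One way to make the random choice more robust is to test a candidate port before using it and retry on collision. This is only a sketch, assuming bash and its /dev/tcp feature; the helper name pick_free_port is ours and does not appear in the job scripts below:

```shell
#!/usr/bin/env bash
# Sketch: pick a random port in the 8000-9999 range, retrying a few times
# if something on this node is already listening there.
pick_free_port() {
    local port try
    for try in 1 2 3 4 5; do
        port=$(shuf -i 8000-9999 -n 1)
        # bash's /dev/tcp: the redirection fails if nothing listens on the port
        if ! (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
            echo "${port}"
            return 0
        fi
    done
    return 1
}

port=$(pick_free_port)
echo "using port ${port}"
```

A function like this could replace the plain shuf call in the job scripts, at the cost of a little extra complexity.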

Here is a template for submitting a Jupyter Notebook server as a Slurm batch job. You may need to edit some of the Slurm options, including the time limit. Save your edited version of this script on the cluster and submit it with sbatch.

Jupyter Template


#!/usr/bin/env bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=8G
#SBATCH --time=1-0:00:00
#SBATCH --job-name=jupyter-notebook
#SBATCH --output=jupyter-notebook-%J.log

# Load the same python as used for installation
module load python/3.8.7

# Choose a random port and determine the execution host
port=$(shuf -i8000-9999 -n1)
node=$(hostname -s)
PYTHON_USER_BASE=$(python3 -m site --user-base)

# Start the Jupyter Notebook server. Remember: it is reachable only while the job is running!
$PYTHON_USER_BASE/bin/jupyter-notebook --no-browser --port=${port} --ip=${node}


3. Connecting to the Notebook Server

Once your submitted batch job starts, your notebook server will be ready for you to connect. You can run squeue -u $(whoami) to check the status of all of your jobs. A running notebook job shows an "R" in the ST (status) column; a "PD" means the job is still pending, and you will have to wait for it to start before you can connect. The log file with the connection information is written to the directory you submitted the script from and is named jupyter-notebook-[jobid].log, where jobid is the Slurm ID of your job.
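Once the job is running, the connection address can be pulled out of that log file with standard tools. In the sketch below the job ID, hostname, and log contents are stand-ins for illustration only; in reality Slurm writes the file for you:

```shell
# Hypothetical log name; the real one follows the --output pattern in the job script.
logfile="jupyter-notebook-12345678.log"

# Stand-in content for illustration only; Slurm normally writes this file.
printf 'To access the notebook, open:\nhttp://ncm0123:8451/?token=abcdef\n' > "$logfile"

# Pull the last URL the notebook server printed into the log.
url=$(grep -Eo 'http://[^[:space:]]+' "$logfile" | tail -n 1)
echo "$url"

rm -f "$logfile"
```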


4. Browsing the Notebook

Finally, open a web browser on a login node and enter the address of your Jupyter notebook from the last lines of the log file. This link will look something like http://<node>:<port>/?token=<token>, with the compute node's hostname and the randomly chosen port from the job script.

After opening this link you will land on a web page showing the contents of the directory the Jupyter job was started in, from which you can open a new notebook or terminal.

Example screenshots: Jupyter Notebook files view, Binder view, and terminal.

5. Starting the Notebook Server on interactive front ends

In the above example, the Notebook server runs within a batch job. This is advantageous if you need a more-than-trivial amount of compute resources (e.g. access to GPUs, or many cores to compute something in parallel). If you only need a negligible amount of resources, which is the case whenever you edit your program, think and contemplate, or drink some coffee (in short: whenever your notebook is not actually computing), a batch job is overkill or even a waste of resources (in the case of many-core batch jobs). Whenever you use the notebook for interactive development without using many resources, you are welcome to start your Jupyter Notebook directly on the interactive nodes, with no wait time for batch job scheduling. For example, you can save the above example script in a file and execute it there (perhaps in a 'screen' session).

However, remember that the general rules for resource usage on interactive nodes apply: do not run long-term or many-core workloads. Do not start Jupyter Notebook on the GPU front ends, as any import of a GPU-capable AI/DL framework like TensorFlow will likely lock the GPUs, and the notebook will be killed after a fairly short time.
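As a sketch of this interactive variant, to be typed on a non-GPU front end (the session name jupyter and port 8890 are arbitrary choices of ours):

```shell
# Start a detachable terminal session so the notebook survives a lost SSH connection
screen -S jupyter

# Inside the screen session: load the same Python that was used for the installation
module load python/3.8.7

# Start the notebook; it runs until you stop it or the interactive-use limits apply
$(python3 -m site --user-base)/bin/jupyter-notebook --no-browser --port=8890

# Detach with Ctrl-a d; reattach later with: screen -r jupyter
```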


6. Starting the Notebook Server in a batch job with SSH tunnel

This job template starts a batch job and creates an SSH tunnel from the submitting host to the compute node. This lets you work on your local host using a local browser. You may need to edit some of the Slurm options, including the time limit. Save your edited version of this script on the cluster and submit it with sbatch.


#!/usr/bin/env bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=8G
#SBATCH --time=1-0:00:00
#SBATCH --job-name=jupyter-notebook
#SBATCH --output=jupyter-notebook-%J.log


echo "------------------------------------------------------------"
echo "Running on nodes: $SLURM_NODELIST"
echo "------------------------------------------------------------"

# Load the same python as used for installation
module load python/3.8.7

# set a random port for the notebook, in case multiple notebooks are
# on the same compute node.

NOTEBOOKPORT=`shuf -i 8000-8500 -n 1`

# set a random port for tunneling, in case multiple connections are happening
# on the same login node.

TUNNELPORT=`shuf -i 8501-9000 -n 1`

# set a random access token
TOKEN=`cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 49 | head -n 1`

echo "On your local machine, run:"
echo ""
echo "ssh -L8888:localhost:$TUNNELPORT $USER@$SLURM_SUBMIT_HOST -N -4"
echo ""
echo "and point your browser to http://localhost:8888/?token=$TOKEN"
echo "Change '8888' to some other value if this port is already in use on your PC,"
echo "for example, you have more than one remote notebook running."
echo "To stop this notebook, run 'scancel $SLURM_JOB_ID'"

# Set up a reverse SSH tunnel from the compute node back to the submitting host
# (the login node); this is the machine we will later connect to with SSH
# forward tunneling from our client.
ssh -R $TUNNELPORT:localhost:$NOTEBOOKPORT $SLURM_SUBMIT_HOST -N -f

# Start the notebook
srun -n1 $(python3 -m site --user-base)/bin/jupyter-notebook --no-browser --port=$NOTEBOOKPORT --NotebookApp.token=$TOKEN --log-level WARN
# To stop the notebook, use 'scancel'


7. General information

Jupyter Working Directory

By default Jupyter opens a view to the submit directory of the job.
To change this behaviour the jupyter command in the last line of the job script can be altered like this:

jupyter-notebook --no-browser --port=${port} --ip=${node} --notebook-dir /path/to/directory

Last modified on 13.12.2022


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.