
intelvtune

Brief Information

Intel VTune Amplifier is a threading and performance optimization tool for Fortran, C/C++, Java, Assembly and more. It offers both GUI and CLI tools for data collection and analysis, along with a rich set of predefined analysis types. Custom analysis types tailored to a specific task can also be defined. The hpc-wiki provides a video tutorial about the Intel VTune profiler.


Detailed Information

To launch the VTune Amplifier GUI:

> ml VTune
> vtune-gui

Many predefined analysis types are readily available:

  • Basic Hotspots
  • Advanced Hotspots
  • Concurrency
  • Locks and Waits

These do not require any specific hardware or OS kernel-level support and are available on every machine in the HPC cluster.

You can run Intel VTune Amplifier in a batch job in two ways:

  • start an interactive GUI session and launch the GUI as described above;
  • perform data collection using the CLI tools in a batch job; after the job has finished, use the GUI to analyze the collected results.

The most convenient way to build the CLI command line is to start by creating the desired analysis project in the amplxe-gui GUI. Once you have chosen the analysis type, the binary to run, its options and so on, click the Command Line... button in the lower right corner of the window. A popup with the full command line will be displayed.

Note: on SLURM, to activate support for hardware counters (needed for hardware-counter-based analysis types), you have to add the following option to your batch job:

#SBATCH --hwctr=vtuneperf

This option also sets your job to exclusive mode, so mind the resource consumption!
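
Before starting a hardware-counter-based collection, it can help to confirm that the sampling drivers are actually loaded on the node. The following is a minimal sketch, not an official check: the helper name check_vtune_drivers is hypothetical, and treating the module name prefixes sep, pax and vtsspp as all required is an assumption.

```shell
# Hypothetical helper: report which sampling driver prefixes are absent
# from an lsmod-style listing passed as the first argument.
# Assumption: modules starting with sep, pax and vtsspp must all be present.
check_vtune_drivers() {
    listing="$1"
    missing=""
    for mod in sep pax vtsspp; do
        # lsmod prints the module name in the first column of each line
        if ! printf '%s\n' "$listing" | grep -q "^${mod}"; then
            missing="${missing}${missing:+ }${mod}"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing: $missing"
        return 1
    fi
    echo "all sampling drivers loaded"
}

# Typical use inside a batch job (not executed here):
#   check_vtune_drivers "$(lsmod)" || exit 1
```

Passing the listing as an argument keeps the helper independent of the node it runs on, so it can be tried out on a login node with a saved lsmod output.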

For further details on how to use VTune Amplifier, please contact the HPC team or attend one of our regular workshops.

 

Example batch script for Intel VTune Amplifier with GUI and support for hardware counters

#!/usr/bin/zsh
 
### Job name
#SBATCH --job-name=VTuneGUI-HwC
 
### Request the time you need for execution in minutes
#SBATCH --time=120
 
### Request X11-Forwarding for this job
#TBD:
 
### Request Hardware Counters support for VTune Amplifier
### Note that this also makes your job exclusive
#SBATCH --hwctr=vtuneperf
 
 
### Load the module and execute the GUI
module load intelvtune
amplxe-gui

Example batch script for CLI use of Intel VTune Amplifier with support for hardware counters; separate analysis after job completes

#!/usr/bin/zsh
 
### Job name
#SBATCH --job-name=VTuneCLI-HwC
 
### Request the time you need for execution in minutes
#SBATCH --time=120
 
### Request Hardware Counters support for VTune Amplifier
### Note that this also makes your job exclusive
#SBATCH --hwctr=vtuneperf
 
 
### Load modules and execute
### CLI collection of General Exploration type (with hardware counters)
### binary file 'a.out' in $HOME/test_program for user ab123456
### parameters passed to 'a.out' - '200 1 1'
module load intelvtune
amplxe-cl -collect general-exploration -result-dir my_experiment -app-working-dir /home/ab123456/test_program \
    -- /home/ab123456/test_program/a.out 200 1 1

After the job has finished, results will be available in the my_experiment directory and can be loaded in the GUI for analysis.
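
As an alternative to opening the GUI right away, a text summary can be printed on the command line. The sketch below assumes the result directory name my_experiment from the script above and uses the amplxe-cl summary report; the helper name summarize_result and its return codes are hypothetical.

```shell
# Hypothetical helper: print a text summary of a finished collection.
# Returns 1 if the result directory is missing, 2 if amplxe-cl is not
# on the PATH (for example because the module has not been loaded).
summarize_result() {
    result_dir="${1:-my_experiment}"
    if [ ! -d "$result_dir" ]; then
        echo "no result directory: $result_dir" >&2
        return 1
    fi
    if ! command -v amplxe-cl >/dev/null 2>&1; then
        echo "amplxe-cl not found; run 'module load intelvtune' first" >&2
        return 2
    fi
    # Generate the summary report from the collected data
    amplxe-cl -report summary -result-dir "$result_dir"
}

# Typical use after the job has finished (not executed here):
#   summarize_result my_experiment
```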

 

Example batch script for CLI use of Intel VTune Amplifier with support for MPI and hardware counters; separate analysis after job completes

#!/usr/bin/zsh
  
### Job name
#SBATCH --job-name=MPIVTuneCLI-HwC
  
### Request the time you need for execution in minutes
#SBATCH --time=120
 
### This is a parallel (MPI) batch job on a single node
### NOTE: multi-node jobs are currently not supported!
#SBATCH --nodes=1
#SBATCH --ntasks=8
  
### Request Hardware Counters support for VTune Amplifier
### Note that this also makes your job exclusive
#SBATCH --hwctr=vtuneperf
 
  
### Load modules, check that all kernel modules are available, run CLI collection
module load intelvtune
lsmod | grep -e sep -e pax -e vtsspp
$MPIEXEC -l $FLAGS_MPI_BATCH amplxe-cl -trace-mpi -result-dir my_experiment -collect general-exploration -app-working-dir /home/ab123456/test_program \
    -- /home/ab123456/test_program/a.out 200 1 1

NOTE: MPI jobs utilizing multiple nodes are currently not supported.
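
Because of this restriction, a job script can guard against an accidental multi-node allocation before the collection starts. The following is a minimal sketch that relies on the SLURM_JOB_NUM_NODES environment variable set by Slurm inside a job; the helper name require_single_node is hypothetical, and defaulting to 1 when the variable is unset is an assumption.

```shell
# Hypothetical guard: fail early if the job spans more than one node.
# SLURM_JOB_NUM_NODES is set by Slurm inside a job; outside of a job
# the variable is unset and this sketch assumes a single node.
require_single_node() {
    nodes="${SLURM_JOB_NUM_NODES:-1}"
    if [ "$nodes" -ne 1 ]; then
        echo "VTune MPI collection needs a single node, but this job has $nodes" >&2
        return 1
    fi
}

# Typical use in the batch script, before the $MPIEXEC line (not executed here):
#   require_single_node || exit 1
```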

 

Example batch script for CLI use of Intel VTune Amplifier for analysis of a single rank in unsupported MPI environments; separate analysis after job completes

#!/usr/bin/zsh
 
### Job name
#SBATCH --job-name=MPIVTuneCLI
 
### Request the time you need for execution in minutes
#SBATCH --time=120
 
### This is a parallel (Open MPI) batch job
#SBATCH --ntasks=8
 
 
### Request Hardware Counters support for VTune Amplifier
### Note that this also makes your job exclusive
#SBATCH --hwctr=vtuneperf
  
### Load modules, run CLI collection
module switch intelmpi openmpi
module load intelvtune
cd $HOME/test_program
$MPIEXEC $FLAGS_MPI_BATCH -n 1 amplxe-cl -result-dir my_experiment -collect hotspots a.out 200 1 1 : \
    -n 7 a.out 200 1 1

This runs the collection on rank 0 only.
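
The command uses the colon-separated (MPMD) launch syntax: the first segment starts one profiled rank, the second starts the remaining ranks without the profiler. The sketch below shows how such a line could be assembled, assuming the rank counts on both sides of the colon should add up to the job's --ntasks value. The helper name build_single_rank_cmd is hypothetical, and it only prints the command instead of running it.

```shell
# Hypothetical helper: print an MPMD launch line that profiles one rank
# and runs the remaining ranks unprofiled.
# Usage: build_single_rank_cmd <ntasks> <program> [args...]
build_single_rank_cmd() {
    ntasks="$1"
    shift
    # First segment: a single rank wrapped in the VTune collector
    profiled="\$MPIEXEC \$FLAGS_MPI_BATCH -n 1 amplxe-cl -result-dir my_experiment -collect hotspots $*"
    # Second segment: all other ranks run the plain program
    rest=$((ntasks - 1))
    if [ "$rest" -gt 0 ]; then
        printf '%s\n' "$profiled : -n $rest $*"
    else
        printf '%s\n' "$profiled"
    fi
}

build_single_rank_cmd 8 a.out 200 1 1
```

For a job with 8 tasks this prints a line with -n 1 for the profiled rank and -n 7 for the rest.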


Additional Notes

Further reading: https://software.intel.com/en-us/blogs/2015/05/26/how-to-profile-mpi-processes-on-all-nodes

last changed on 09/09/2024


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.