1. How to Access the Software
module load CHEMISTRY
module load gaussian
2. Setting Limits
The most important rule for running Gaussian on the cluster: the memory limit requested from the batch system, given by #BSUB -M [value in MB] in the LSF batch job script, must be higher than the memory limit set with the %mem directive in the Gaussian input file, because Gaussian itself needs memory on top of what %mem allots.
Always define the maximum amount of disk space the program may use with the keyword MaxDisk (e.g. MaxDisk=30GB) in the route section of your Gaussian input file!
(Note: The default disk space limit defined in the Default.Route file is generally too low (less than 2 GB), but it is fixed to this value for security reasons.)
Link 906 uses an excessive amount of disk space for writing the integral data to the rwf file if it automatically switches to the disk-based calculation method.
To alleviate this problem, add the option FullDirect in parentheses after the relevant method keyword MPn (n = 2, 3, 4) in the route section of your input file, e.g. MP2=(FullDirect).
(Note: Link 906 runs effectively in main memory, recalculating the integrals as needed instead of saving all integral data in a large rwf file on disk, if and only if the memory passed to Gaussian via %mem is large enough. You have to test this yourself.)
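The relation between the batch memory limit and %mem can be sketched in a few lines of shell. This is only an illustration: the helper names gaussian_mem_mb and print_gaussian_header are hypothetical, and the headroom left for Gaussian itself is an assumed value that has to be tuned per job.

```shell
# Hypothetical helper: derive a %mem value (in MB) that stays below the
# memory limit requested from the batch system.
# $1 = batch memory limit in MB
# $2 = headroom left for Gaussian itself in MB (assumed value, tune per job)
gaussian_mem_mb() {
  limit_mb=$1
  headroom_mb=$2
  if [ "$headroom_mb" -ge "$limit_mb" ]; then
    echo "headroom must be smaller than the batch memory limit" >&2
    return 1
  fi
  echo $(( limit_mb - headroom_mb ))
}

# Hypothetical helper: print a %mem line and a route line that carries
# both FullDirect and MaxDisk.
# $1 = batch memory limit in MB, $2 = headroom in MB, $3 = MaxDisk value
print_gaussian_header() {
  mem_mb=$(gaussian_mem_mb "$1" "$2") || return 1
  printf '%%mem=%sMB\n' "$mem_mb"
  printf '#P MP2=(FullDirect)/6-311++G** MaxDisk=%s\n' "$3"
}
```

For example, print_gaussian_header 10240 2048 30GB prints a %mem line of 8192MB for a batch limit of 10240 MB, followed by a route line with MaxDisk=30GB.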
3. Example Batch Script
### Job name
### File / Path which STDOUT will be written to, %J is the job ID
### Request the time you need for execution. The full format is D-HH:MM:SS
### You must at least specify minutes or days and hours and may add or
### leave out any other parameters
# This corresponds to the number of processors (no hyperthreading possible)
# to use with gaussian as set via %NProcShared=[number_of_threads] in the
# gaussian input file (a number between 4 and 12 should be reasonable)
### Request the memory you need for your job. You can specify this
### in either MB (1024M) or GB (4G). BEWARE: This is a per-cpu limit,
### and will be multiplied with cpus-per-task for the total requested memory
###### end of batch directives ######
###### start of shell commands ######
# load the necessary module files
module load CHEMISTRY
module load gaussian
# execute the gaussian binary
g09 < [name_of_inputfile] > [name_of_outputfile]
Scaling and speedup of Gaussian on BCS hosts
How to speed up exclusive(!) calculations on a BCS host:
1) Keep the rwf file on the shared memory device (mounted on each machine under /dev/shm) during the calculation by setting its file path in the Gaussian input file with the Link 0 directive %RWF.
This shm device is available on each SMP machine and amounts to roughly half of the physical memory installed in the machine. Its access latency is of the same order as that of main memory.
2) Do not forget to copy the rwf file back (manually) to one of your $HOME NFS folders after the calculation has finished.
This method avoids frequent and slow read/write operations on the hard disk and speeds up the calculation by a factor of roughly 1.3 to 1.5.
PLEASE only use this method for jobs running exclusively(!) on an SMP machine.
Otherwise you risk crashing other people's jobs if they also try to access the shared memory device (Gaussian writes rather big temporary rwf files of several hundred GiB or even some TiB, depending on the calculation type).
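The two steps above could be wrapped in small shell helpers such as the following. This is a sketch only: the function names and the directory naming scheme are hypothetical, and it assumes the job runs exclusively on the machine.

```shell
# Hypothetical helper: create a private directory for the rwf file.
# $1 = base directory (on the cluster this would be /dev/shm)
# $2 = job identifier, used to keep the files of different jobs apart
make_rwf_dir() {
  rwf_dir="$1/gaussian_rwf_$2"
  mkdir -p "$rwf_dir" || return 1
  echo "$rwf_dir"
}

# Hypothetical helper: copy the rwf file(s) back to permanent storage
# and free the shared memory afterwards.
# $1 = directory created by make_rwf_dir
# $2 = destination directory (e.g. a folder below $HOME)
save_and_clean_rwf() {
  mkdir -p "$2" || return 1
  cp "$1"/*.rwf "$2"/ || return 1
  rm -rf "$1"
}
```

In a batch script, one would call make_rwf_dir /dev/shm with the job ID before starting Gaussian, point the %RWF directive in the input file to a file inside the returned directory, and call save_and_clean_rwf once g09 has finished.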
Gaussian has been compiled with the PGI compiler, which by default limits the number of threads to 64.
Setting the environment variable OMP_THREAD_LIMIT=128 overrides this limit.
But you should make sure that Gaussian really profits from that many threads. In our experience 32 threads is a good number for large calculations. We would be curious to learn more about your experiences with the scalability of Gaussian.
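In a batch script the variable would typically be exported before Gaussian is started, for example:

```shell
# Raise the thread limit of the OpenMP runtime from the default of 64.
# The number of threads Gaussian actually uses is still set via
# %NProcShared in the input file.
export OMP_THREAD_LIMIT=128
```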
Problems starting gaussview
Do you have problems starting GaussView because of missing Qt library dependencies or missing OpenGL support?
You may be able to solve this by passing the switch -mesagl to the gview script, i.e. by starting GaussView with that switch appended to your usual gview call.
Problem with opt-freq multi-step jobs in gaussian09
The following example demonstrates a serious drawback of the automatic opt-freq chain jobs offered by Gaussian.
- Description of the problem: Opt-freq chain jobs are a common type of Gaussian calculation.
The route section and the Link0 command section of the first job step, namely the geometry optimization, are set up manually by the user.
The maximum physical memory to be used by all Gaussian processes together is limited to 8GiB, and the number of OpenMP threads spawned by each link is fixed to 12.
The name of the chk file is given as benzene.chk (the .chk extension is added automatically by Gaussian).
The route section in the input file reads:
#P MP2=(Direct)/6-311++G** SCF=(Direct) Opt Freq MaxDisk=40GB (1)
With the combination of Opt and Freq, this section (1) defines a geometry optimization that is automatically followed by a frequency calculation. The method is second-order Moeller-Plesset perturbation theory (MP2) with a triple-zeta valence basis set including polarization and diffuse functions.
It clearly states that the disk space allocated by all Gaussian processes together is restricted to at most 40GB.
During the optimization run Gaussian honours this disk space limit, and the program correctly reports the available disk space as 40GB.
Now let's look at the second job step. Its route section, for the frequency calculation, is generated automatically by Gaussian and looks like this:
#P Geom=AllCheck Guess=TCheck SCRF=Check GenChk RMP2(FC)/6-311++G(d,p) Freq (2)
And whoops, the information about the disk space limit is lost, and the available disk space is no longer fixed to 40GB!
Instead of carrying over the limit defined by the user in the manually written route section (1), Gaussian omits it from the route section (2) of the second job and simply takes the default value for the maximum disk space from the Default.Route file in the Gaussian installation path. But this config file restricts the maximum to about 0.19GiB! This is definitely not enough disk space for the rwf file of our test case, which has grown to about 4.1GiB by the end of the whole calculation run.
- Possible solutions: You can work around this problem in one of the following ways:
- Run the frequency calculation as a separate Gaussian run with a new input file that defines the disk space limit again (laborious).
- Put both jobs (optimization and frequency calculation) into one input file, separated by the multi-step directive --Link1-- placed in front of the Link0 commands of the second job, and state MaxDisk again in the route section of the second job. The relevant part of the input file should read as follows:
[final newline of first job]
--Link1--
[Link0 commands of second job]
[route section of second job, including MaxDisk]
- The most convenient workaround is the following: create a file named Default.Route in the working directory of your job that contains a line setting MaxDisk to the required value.
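A Default.Route file of this kind could be written from the batch script as sketched below. The helper name write_default_route is hypothetical; the sketch assumes the usual Default.Route syntax, in which a line starting with -#- supplies default route keywords, and uses the 40GB value from the example above.

```shell
# Hypothetical helper: write a Default.Route file that sets MaxDisk.
# $1 = working directory of the job
# $2 = disk space limit in Gaussian notation (e.g. 40GB)
write_default_route() {
  # A "-#-" line holds keywords that are added to the route section of every job step.
  printf '%s MaxDisk=%s\n' '-#-' "$2" > "$1/Default.Route"
}
```

Calling write_default_route . 40GB before g09 is started creates the file in the current working directory, so the limit also applies to the automatically generated frequency step.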
gaussview cannot open chkpointfile
If you get a parse error like the following when you try to open a checkpoint file:
Missing or bad data: Alpha Orbital Energies
Line Number xxx
the number of independent functions may differ from the number of basis functions, because Gaussian eliminated some linearly dependent functions. This gives GaussView a big headache, but you can help it out.
Just change the value aaa in the line
Number of basis functions I aaa
to the value bbb from the line
Number of independent functions I bbb
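The edit can also be scripted. The following is a minimal sketch, assuming the file is a formatted (plain-text) checkpoint file in which both lines appear exactly as shown above; the function name fix_basis_function_count is hypothetical. It writes a corrected copy and leaves the original untouched.

```shell
# Hypothetical helper: copy a formatted checkpoint file, replacing the value
# in the "Number of basis functions" line (aaa) with the value from the
# "Number of independent functions" line (bbb).
# $1 = input file, $2 = output file
fix_basis_function_count() {
  n_indep=$(awk '/^Number of independent functions/ { print $NF; exit }' "$1")
  if [ -z "$n_indep" ]; then
    echo "no 'Number of independent functions' line found in $1" >&2
    return 1
  fi
  awk -v n="$n_indep" '
    /^Number of basis functions/ {
      width = length($0)
      sub(/[0-9]+[ \t]*$/, "")                # strip the old value aaa
      fmt = "%s%" (width - length($0)) "d\n"  # keep the original column width
      printf fmt, $0, n                       # write bbb right-aligned
      next
    }
    { print }
  ' "$1" > "$2"
}
```

Open the corrected copy in GaussView and keep the original file for further Gaussian runs.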