ITS/High Performance Computing Cluster/help on Gaussian

Using Gaussian G03 On The ITS HPC Cluster

Gaussian G03 is a software package for electronic structure (quantum chemistry) calculations. It is installed for serial (single-processor) use, for parallel (multi-processor, shared memory) use within a single compute node, and for distributed (multi-processor, distributed memory) use with TCP Linda. Release E01 of Gaussian G03 is installed on the cluster.

Due to licensing restrictions, you must make a specific request to its-cluster-admin@case.edu to be able to use Gaussian.

To use release E01 of Gaussian G03 for jobs run directly on the master node or submitted to the PBS batch queue, you must modify the environment using the following command:

   module load gaussian-E01

This command should be typed at the shell prompt to prepare for interactive use, and it should be included in PBS scripts for batch Gaussian jobs. The modifications to the environment will remain in effect until you log out or until the shell command

   module unload gaussian

is issued.
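
To confirm that the environment is set up before starting a run, a quick check along the following lines may help. This is a minimal sketch: it assumes that loading the module places the g03 executable on the PATH, and the helper name check_cmd is illustrative only.

```shell
# Succeed if the named command can be found on the PATH.
check_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Illustrative check, intended to be run after "module load gaussian-E01".
# Assumption: the module makes g03 visible on the PATH.
if check_cmd g03; then
    echo "g03 found: $(command -v g03)"
else
    echo "g03 not found; load the Gaussian module first" >&2
fi
```

The same check can be placed in a PBS script just before the g03 line so that a job fails early with a clear message.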

Except for short debugging runs, all Gaussian jobs should be submitted to the PBS batch queue for execution on cluster compute nodes. In this case, you should load the Gaussian module in the PBS script used to submit the job before the invocation of Gaussian.

To run a Gaussian G03 batch job on the cluster, you need to create a PBS script for it. The sample script below includes a request for one processor in a single compute node and imposes a limit of 10 hours of wall time. Note that the script includes the command to load the Gaussian module as pointed out previously.

  #PBS -N water03
  #PBS -l walltime=10:00:00
  #PBS -l nodes=1:ppn=1:single3800
  #PBS -j oe
  #
  # Load the Gaussian module
  module load gaussian-E01
  #
  # cd to the directory where the job was submitted
  cd $PBS_O_WORKDIR
  #
  # Run Gaussian
  g03 water03.com

The bold portions of this script should be modified as required for the job to be run. Note that the script names "water03.com" as the Gaussian input file; this file might contain the following lines:

  %chk=water03.chk
  
  #b3lyp/6-311+G(3df,2p) opt freq
 
  Gaussian test file
  
  0 1
  o
  h 1 r
  h 1 r 2 a
  
  r=0.98
  a=109.

As before, the portions of this file in boldface should be modified as needed for other jobs.

Running a Serial Gaussian Job

To run a serial Gaussian job, the input file must specify (1) the name of the checkpoint file (.chk); (2) the route section lines (# commands); (3) the title line; (4) the charge and spin multiplicity line; (5) the molecule specification as a symbolic Z-matrix, a standard Z-matrix, or Cartesian coordinates; and (6) any additional input needed for your job. A blank line is required following the molecule specification; otherwise Gaussian will assume the input is a symbolic Z-matrix and an error in link l101.exe may result. Comments can be inserted following an exclamation mark.

Input files for Gaussian jobs can be created on a desktop computer using the GaussView GUI and then uploaded to the cluster. If you create Gaussian input files in a Windows or MS-DOS environment, run the "dos2unix" command on them on the cluster after the transfer to convert them to Unix text format. If you create the input files directly on the cluster, e.g. using the vi editor, this step is not needed.
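
Two of the pitfalls described above, DOS line endings and a missing final blank line, can be detected before a job is submitted. The following is a minimal sketch using standard Unix tools; the function names has_crlf and ends_with_blank_line are hypothetical helpers, not part of Gaussian or of the cluster software.

```shell
# Succeed if the file contains DOS-style (CRLF) line endings.
has_crlf() {
    grep -q "$(printf '\r')\$" "$1"
}

# Succeed if the last line of the file is blank (empty or whitespace only).
ends_with_blank_line() {
    tail -n 1 "$1" | grep -q '^[[:space:]]*$'
}

# Illustrative usage on an input file named gau-testjob.com:
#   has_crlf gau-testjob.com && dos2unix gau-testjob.com
#   ends_with_blank_line gau-testjob.com || echo "" >> gau-testjob.com
```

The usage lines are left as comments so that the helpers can be sourced without modifying any file.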

Here is a sample input file for a serial job contained in the file gau-testjob.com. As before, bold parts should be modified as needed.

  ! Specify the checkpoint file
  %chk=gau-testjob.chk
  ! Specify memory requirement for your job
  %mem=256Mb
  ! Run Hartree-Fock energy with a STO-3G basis set
  # rhf/sto-3g
  ! A blank line follows
  
  ! Now the title of the job
  Test Gaussian serial job using rhf/sto-3g
  ! Another blank line follows
  
  ! Charge and Spin Multiplicity (here neutral charge and a singlet spin, respectively)
  0 1
  O     ! Molecule input (here water in a symbolic Z-matrix)            
  H 1 B1
  H 1 B2 2 A1
  B1    0.96
  B2    0.96
  A1  109.5 
  ! A blank line to terminate the input
  

A suitable PBS batch script, gau-testjob.pbs, to submit this job to the cluster follows, with the parts to be modified as needed in boldface type.

  #PBS -N gau-testjob
  #PBS -l nodes=1:ppn=1:single3800
  #PBS -l walltime=10:00:00
  #PBS -j oe
  
  # cd to directory from which qsub was run to submit the job
  cd $PBS_O_WORKDIR
  
  # set environment variables for Gaussian
  module load gaussian-E01
  
  # Run Gaussian
  g03 gau-testjob.com

Note that this script requests 3.8 GHz processors as they are the fastest available in the cluster. The next-fastest processors available are 3.2 GHz and can be requested by specifying the "single3200" attribute.

Running A Parallel Gaussian Job

Because not all Gaussian procedures are written to use parallel processing, it is important to request multiple processors only when they will reduce the processing time for the job. Requesting processors that are not used by a job will tend to increase the time a job waits in the queue for service, decrease overall cluster utilization, and delay other jobs in getting the processors they need. See the Gaussian web site for information on how to specify Gaussian job parameters for most efficient operation.

Changes Required In The Gaussian Input File

The number of processors using shared memory must be specified in the Gaussian input file. This specification can be made by inserting the following line at the beginning of the Gaussian input file:

  %NProcShared=2

This line specifies that two processors will share memory within the compute node assigned to run the job. As indicated by the bold type, the number of processors should be modified as needed for the job to be run.

Changes Required In The PBS Batch Script

In the PBS batch script that is used to submit the job to the cluster, the number of nodes and processors per node must be specified to match the specification given within the Gaussian input file. In this example, this can be done with the following line:

  #PBS -l nodes=1:ppn=2

where the bold value should be modified as needed. Note that this requests any node that is available. To specifically request the faster 3.8 GHz nodes, the following line could be used:

  #PBS -l nodes=1:ppn=2:single3800

The newer multi-core processors must be selected if more than two processors are desired for the job. For example,

  #PBS -l nodes=1:ppn=4:dual

could be used to select a compute node with two dual-core processors in order to make use of four processors, or

  #PBS -l nodes=1:ppn=8:quad

could be used to select a compute node with two quad-core processors in order to make use of eight processors.

Note that, in general, the multi-core nodes have slower processors than the single-core nodes so that a Gaussian job might actually run more slowly if a multi-core node is requested and the Gaussian job cannot make use of multiple processors.

By not specifying any particular node attribute, such as in

   #PBS -l nodes=1:ppn=4

any type of node can be used if it can supply the required 4 processors. This approach will minimize the job's waiting time in the queue but could result in slower processors being assigned.
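
The resource request variants discussed in this section differ only in the processor count and an optional node attribute. As a minimal sketch, a small shell helper can generate the line; the function name pbs_resource_line is hypothetical, and the attribute names in the examples are the ones mentioned in the text above.

```shell
# Print a PBS resource request for a single node.
#   $1 = processors per node
#   $2 = optional node attribute (for example single3800, dual, or quad)
pbs_resource_line() {
    if [ -n "${2:-}" ]; then
        printf '#PBS -l nodes=1:ppn=%s:%s\n' "$1" "$2"
    else
        printf '#PBS -l nodes=1:ppn=%s\n' "$1"
    fi
}

# Illustrative usage:
#   pbs_resource_line 8 quad   # eight processors on a quad-core node
#   pbs_resource_line 4        # four processors on any node type
```

Omitting the second argument corresponds to the last case above, where any node that can supply the requested processors may be assigned.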

A Complete Sample Parallel Gaussian Job

A sample Gaussian input file, gau-parallel.com, to use two processors in a node follows. The bold parts should be modified as needed.

  %NProcShared=2
  %chk=gau-parallel.chk
  %mem=256Mb
  # rhf/sto-3g
  
  Test parallel Gaussian job using rhf/sto-3g
  
  0 1
  O            
  H 1 B1
  H 1 B2 2 A1
  B1    0.96
  B2    0.96
  A1  109.5 
  

A PBS batch script, gau-parallel.pbs, for this job follows. The bold parts should be modified as needed.

  #PBS -N gau-parallel
  #PBS -l nodes=1:ppn=2:mhz3800
  #PBS -l walltime=10:00:00
  #PBS -j oe
  
  cd $PBS_O_WORKDIR
  
  module load gaussian-E01
  
  g03 gau-parallel.com 
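
Because the %NProcShared value in the input file must match the ppn value in the PBS script, it can be useful to compare the two before submitting. The following is a minimal sketch; the function names are hypothetical, and it assumes that each file contains at most one relevant line.

```shell
# Print the %NProcShared value from a Gaussian input file (empty if absent).
nprocshared_of() {
    sed -n 's/^[[:space:]]*%[Nn][Pp][Rr][Oo][Cc][Ss][Hh][Aa][Rr][Ee][Dd]=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Print the ppn value from the "#PBS -l nodes=..." line of a PBS script (empty if absent).
ppn_of() {
    sed -n 's/^[[:space:]]*#PBS.*ppn=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if both values are present and equal.
#   $1 = Gaussian input file, $2 = PBS script
procs_match() {
    n=$(nprocshared_of "$1")
    p=$(ppn_of "$2")
    [ -n "$n" ] && [ "$n" = "$p" ]
}

# Illustrative usage:
#   procs_match gau-parallel.com gau-parallel.pbs || echo "processor counts differ" >&2
```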

A Complete Sample Distributed Gaussian Job

A sample Gaussian input file, gau-distributed.com, to use two processors in each of four nodes follows. The bold parts should be modified as needed.

  %NProcLinda=4
  %NProcShared=2
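  %chk=gau-distributed.chk

Only the Link 0 lines of the input file are shown here; the remainder of the file (route section, title, charge and multiplicity, and molecule specification) is written as in the earlier examples. A PBS batch script for this job follows.

  #PBS -N gau-distributed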
  #PBS -l nodes=4:ppn=2:mhz3800
  #PBS -l walltime=10:00:00
  #PBS -j oe
  
  cd $PBS_O_WORKDIR
  
  module load gaussian-E01
  
  g03l gau-distributed.com
  
  module unload gaussian-E01
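
In the example above, %NProcLinda=4 and %NProcShared=2 correspond to nodes=4:ppn=2, for a total of eight processors. As a minimal sketch, the total implied by an input file can be computed as follows; the function names are hypothetical, and a missing directive is treated as 1, which is an assumption made for this illustration.

```shell
# Print the numeric value of a Link 0 directive from a Gaussian input file.
#   $1 = directive name in lower case (for example nproclinda), $2 = input file
link0_value() {
    tr 'A-Z' 'a-z' < "$2" | sed -n "s/^[[:space:]]*%$1=\([0-9][0-9]*\).*/\1/p" | head -n 1
}

# Print (Linda workers) x (shared-memory processors per worker).
# Assumption: a directive that is absent counts as 1.
total_procs() {
    linda=$(link0_value nproclinda "$1")
    shared=$(link0_value nprocshared "$1")
    echo $(( ${linda:-1} * ${shared:-1} ))
}

# Illustrative usage:
#   total_procs gau-distributed.com
```

The result should equal nodes multiplied by ppn in the matching PBS script.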

This page was last modified 22:51, May 9, 2008 by Roger Bielefeld.