R on the ITS HPC Cluster
R is a language and environment for statistical computing and graphics. It is similar to the S language and environment developed by John Chambers and colleagues at Bell Labs. Much code written for S runs under R. R provides a variety of statistical and graphical techniques, and is highly extensible. Version 2.9.2 of R is installed on the ITS HPC cluster.
To use R on the cluster, first load the R module:
module load R
Interactive use
To use R interactively for a serial job, first request interactive use of a processor on a compute node:
qsub -I
and wait until you are connected to a shell in a compute node. (The default length of an interactive session is ten hours.) Then load the R module
module load R
and then invoke R using the shell command "R":
[rab5@quad21 ~]$ R
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
>
For interactive use of R that requires multiple processors, you need to create a PBS script that requests the desired number of processors. For example, if you want a ten-hour interactive session with four processors, the script might look like this:
#PBS -N R_interactive
#PBS -l walltime=10:00:00
#PBS -l nodes=1:ppn=4
Specify this PBS script when requesting the interactive node:
qsub -I myscript.pbs
where myscript.pbs is replaced by the name of your PBS script.
After getting connected to the shell on the assigned compute node, load the needed modules:
module unload mpich
module load openmpi
module load R
Then start R and load the Rmpi package:
> library(Rmpi)
>
After exiting from R, remember to also leave the shell on the allocated compute node by using the "logout" or "exit" command. Until you do, the node stays reserved for your session.
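One quick way to tell whether the current shell still belongs to a PBS job, and so still needs to be exited, is to look at the PBS_JOBID environment variable. The sketch below assumes the scheduler exports PBS_JOBID inside jobs, as PBS/Torque normally does; the helper name in_pbs_job is hypothetical.

```shell
# Hypothetical helper: succeeds when the shell is running inside a PBS job.
# Assumes the scheduler exports PBS_JOBID inside jobs.
in_pbs_job() {
    [ -n "${PBS_JOBID:-}" ]
}

if in_pbs_job; then
    echo "Still inside PBS job ${PBS_JOBID}; type 'exit' to release the node."
else
    echo "Not inside a PBS job."
fi
```

Run on the login node, this should report that no job is active; run in the shell opened by qsub -I, it should print the job identifier.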
Batch use
To run an R batch job, prepare a PBS script that contains the command to load the R module. A simple single-processor example is as follows:
#PBS -N R_test
#PBS -l walltime=10:00:00
#PBS -l nodes=1:ppn=1
#PBS -j oe
#
# Load the R module
module load R
#
# cd to the directory where the job was submitted
cd $PBS_O_WORKDIR
#
# Run R
R CMD BATCH input.file
In the above sample script, 'input.file' should be replaced by the name of your input file containing the R commands to be carried out.
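R CMD BATCH writes the session output to a file rather than to the terminal. When no output file is named, R derives the name from the input file by stripping a trailing ".R" and appending ".Rout". The following sketch illustrates that naming rule; the file name analysis.R and its contents are hypothetical, and the R command itself is only shown in comments so the snippet can be tried without R.

```shell
# Hypothetical input file containing a few R commands.
infile="analysis.R"
cat > "$infile" <<'EOF'
x <- c(2, 4, 6, 8)
print(mean(x))
EOF

# Default output name used by R CMD BATCH:
# strip a trailing ".R" from the input name, then append ".Rout".
outfile="${infile%.R}.Rout"
echo "R CMD BATCH $infile will write its output to $outfile"

# In the PBS script the job line would then read (not executed here):
#   R CMD BATCH "$infile"
# After the job finishes, the results can be inspected with:
#   cat "$outfile"
```

An explicit output file can also be given as a second argument to R CMD BATCH if a different name is preferred.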
To run a multi-processor R batch job, prepare a PBS script that also includes a request for multiple processors and that loads the OpenMPI library:
#PBS -N R_test4
#PBS -l walltime=10:00:00
#PBS -l nodes=1:ppn=4
#PBS -j oe
#
# Load modules
module unload mpich
module load openmpi
module load R
#
# cd to the directory where the job was submitted
cd $PBS_O_WORKDIR
#
echo "My machine will have the following nodes:"
echo "-------------------------------"
cat ${PBS_NODEFILE}
echo "-------------------------------"
#
# Run R
/usr/local/bin/mpirun -np 1 R --slave CMD BATCH input.file
# In the R batch file (input.file here), be sure to load the Rmpi package.
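Because mpirun starts only one R process (-np 1), the R batch file itself has to start the remaining workers through Rmpi. The sketch below writes such a batch file from the shell. It assumes one master plus one worker per remaining processor; the file name rmpi_test.R and the fallback of four processors outside a job are illustrative choices, not part of the cluster setup.

```shell
# Number of processors granted to the job: PBS lists one line per processor
# in $PBS_NODEFILE. Outside a job, fall back to 4 (assumption for illustration).
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE}" ]; then
    NPROCS=$(wc -l < "${PBS_NODEFILE}")
else
    NPROCS=4
fi
# One process is the master started by mpirun; the rest are spawned by Rmpi.
NSLAVES=$((NPROCS - 1))

# Write a hypothetical R batch file that uses Rmpi.
cat > rmpi_test.R <<EOF
library(Rmpi)
# Spawn the worker R processes from the master
mpi.spawn.Rslaves(nslaves = ${NSLAVES})
# Ask every worker to report its rank
print(mpi.remote.exec(paste("I am", mpi.comm.rank(), "of", mpi.comm.size())))
# Shut the workers down and leave MPI cleanly
mpi.close.Rslaves()
mpi.quit()
EOF

echo "Wrote rmpi_test.R for ${NSLAVES} workers"
```

The generated file would take the place of the input file on the mpirun line of the PBS script. Whether spawning works as written depends on how Rmpi was built against OpenMPI on the cluster, so treat this as a starting point.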
