ITS/High Performance Computing Cluster/help on MATLAB

Using MATLAB on the ITS HPC Cluster

MATLAB is a software package for general mathematical work. It is installed on the cluster for both serial (single-processor) and distributed (multi-processor) use. R2009a is the latest installed version and the default version on the cluster.


Using MATLAB Interactively or Submitting Jobs via Batch Script

MATLAB can be run interactively on the compute nodes (NOT the master nodes) for testing, debugging, and production runs. A compute node is reserved through the PBS batch queue with the "qsub -I -X" command.

Another efficient way to run MATLAB code is to submit it through the PBS batch submission process. The batch script starts MATLAB with "matlab -nodisplay" so that MATLAB does not try to open an interactive display.

More details are given below.

Single-processor interactive use

Single-processor interactive MATLAB jobs should be run on a compute node. The compute node is reserved through the PBS batch queue with the "qsub -I -X" command, where -I requests an interactive session and -X enables X11 forwarding so that the MATLAB GUI can be displayed. Once the session starts on the compute node, you must load the matlab module.

   qsub -I -X
   module load matlab

Once the module is loaded, MATLAB can be started by giving the shell command

   matlab


Single-processor batch use

Single-processor MATLAB batch jobs require that you write a PBS batch script and submit the script to the PBS batch queue for execution. Within the batch script, you need to load the "matlab" module with the command

   module load matlab

so that the PATH and other environment variables are set properly in the batch job to run MATLAB.
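
A batch script can confirm that the module really put MATLAB on the PATH before it spends wall time. The following is a minimal sketch; require_cmd is a hypothetical helper name, not something provided by the cluster, and it is demonstrated with sh because matlab is only available after the module is loaded on the cluster.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the cluster software):
# succeed only if the named command can be found on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' not found on PATH; was the module loaded?" >&2
    return 1
}

# In a batch script this check would follow "module load matlab":
#   require_cmd matlab || exit 1
# Demonstrated here with a command that exists on any system:
require_cmd sh && echo "sh is available"
```

Failing early in this way keeps a misconfigured job from sitting in the queue only to stop at the first MATLAB command.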

Following is a PBS script that can be used to run a serial MATLAB batch job on the cluster. The sample script includes a request for one processor and 10 hours of wall time. Note that the script includes the command to load the matlab module as mentioned above and specifies the "-nodisplay" option. The script specifies that the actual MATLAB program (or "M-file") to be run is contained in the file "MyJob.m".

  #PBS -N matlab_test
  #PBS -l walltime=10:00:00
  #PBS -l nodes=1:ppn=1
  #PBS -j oe
  # Copy the input files to the temporary directory and cd there
  cp $PBS_O_WORKDIR/* $TMPDIR
  cd $TMPDIR
  # Load MATLAB module
  module load matlab 
  # Run MATLAB
  matlab -nodisplay -r MyJob
  # Copy back to your directory
  cp -ru * $PBS_O_WORKDIR


This script is submitted to the batch queue using the command

  qsub matlab_test.pbs

where matlab_test.pbs is the name of the file containing the script.
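
The copy-in, run, copy-back pattern used by the script can be tried outside PBS. The sketch below makes several assumptions: throwaway directories (JOB_WORKDIR and JOB_TMPDIR, hypothetical names) stand in for $PBS_O_WORKDIR and $TMPDIR, and a placeholder function run_job stands in for the "matlab -nodisplay -r MyJob" line.

```shell
#!/bin/bash
# Sketch of the stage-in / run / stage-out pattern from the PBS script above.
# Assumption: outside a PBS job the scheduler does not set PBS_O_WORKDIR or
# TMPDIR, so two throwaway directories stand in for them.
JOB_WORKDIR=$(mktemp -d)   # stand-in for $PBS_O_WORKDIR (submission directory)
JOB_TMPDIR=$(mktemp -d)    # stand-in for $TMPDIR (scratch directory)

# Hypothetical input file playing the role of MyJob.m
echo "disp('hello')" > "$JOB_WORKDIR/MyJob.m"

# Hypothetical placeholder for "matlab -nodisplay -r MyJob"
run_job() {
    echo "result" > output.txt
}

# Stage in: copy the inputs to scratch and work there
cp "$JOB_WORKDIR"/* "$JOB_TMPDIR"
cd "$JOB_TMPDIR" || exit 1

run_job

# Stage out: -u copies back only files that are new or newer than the originals
cp -ru ./* "$JOB_WORKDIR"

ls "$JOB_WORKDIR"
```

Because of the -u option, unchanged input files are skipped on the way back and only the results produced by the job are copied.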

The PBS batch script given above, with appropriate minor modifications, can be used to submit a wide variety of MATLAB jobs. For example, if the following sample MATLAB M-file (taken from the Western Canada Research Grid) is placed in a file named input.m, you can run the job using this script after changing "-r MyJob" to "-r input". The Postscript output will be put into the file matlab_test_plot.ps. You can send this file to a Postscript printer or view it on-screen using a Postscript viewer like kview on the HPC cluster.

  % MATLAB M-file example to approximate a triangle wave
  % with a truncated Fourier expansion.
  %
  nterms=5;
  fourbypi=4.0/pi;
  np=100;
  y(1:np)=pi/2.0;
  x(1:np)=linspace(-2.0*pi,2*pi,np);
  for k=1:nterms
     twokm=2*k-1;
     y=y-fourbypi*cos(twokm*x)/twokm^2;
  end;
  plot(x,y);
  print -deps matlab_test_plot.ps;
  quit;
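
Since the same PBS script serves many jobs with only the M-file name and wall time changing, the script can be generated. The sketch below is a hypothetical helper, not part of the cluster software: make_matlab_pbs is an invented name, and the try/catch wrapper around the M-file is an addition to the original script, intended to make MATLAB quit even if the M-file fails or does not call quit itself.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not provided by the cluster): print a
# PBS script for a serial MATLAB job.
#   $1 = M-file name without the ".m" extension
#   $2 = wall time (defaults to 10:00:00, as in the sample script)
make_matlab_pbs() {
    local mfile="$1"
    local walltime="${2:-10:00:00}"
    # Escaped dollar signs stay literal so PBS expands them at run time.
    cat <<EOF
#PBS -N ${mfile}
#PBS -l walltime=${walltime}
#PBS -l nodes=1:ppn=1
#PBS -j oe
cp \$PBS_O_WORKDIR/* \$TMPDIR
cd \$TMPDIR
module load matlab
# try/catch plus quit so MATLAB exits even if the M-file fails
matlab -nodisplay -r "try, ${mfile}, catch, disp(lasterr), end, quit"
cp -ru * \$PBS_O_WORKDIR
EOF
}

# Example: write a script for input.m with a 2-hour limit
make_matlab_pbs input 02:00:00 > input.pbs
cat input.pbs
```

The generated file would then be submitted with qsub input.pbs, exactly as for the hand-written script.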

Multi-Processor Distributed Batch Use

Multi-processor, distributed MATLAB makes use of the MATLAB Distributed Computing Engine (MDCE) on the cluster. Because MDCE and PBS must work together, running distributed MATLAB jobs is somewhat more complex than in the serial case discussed previously.

An overview of how distributed MATLAB jobs run using MDCE and PBS is helpful. In the distributed case, the user usually writes the MATLAB program as an M-file that delegates computational tasks to other "worker" M-files and collects the results they generate for possible additional processing. The user edits a configuration file that enables MATLAB to communicate with the Torque scheduler on the cluster. Then the M-file is run; MATLAB builds additional PBS scripts and submits them to the queue in order to request additional compute nodes. These additional nodes run the "worker" M-files in MATLAB kernels executing on those nodes. These instances of MATLAB are known as "workers". Typically, the first M-file waits for all the workers to complete their processing in order to obtain the results they generate. Then the first M-file may carry out additional processing tasks using those results and produce output according to the needs of the user.


Using MATLAB's built-in Torque scheduler

This section is intended for users who run their jobs from the cluster and have a functional X server, so that MATLAB can display its GUI when needed (in this case MATLAB is run from its GUI, not from the terminal). For running jobs on the cluster from a remote machine, see the section "Running MATLAB jobs on the cluster from a remote machine" below. To get an X server on your machine, install a program like Xming. First you must import our cluster's configuration so that MATLAB can communicate with the cluster. Run the commands:

  cd
  wget http://hpcc1-mgmtnode2.case.edu/matlab/2009aTorqueConfig.mat
  module load matlab
  matlab

This should start MATLAB R2009a, which at this point is running on the master node of the cluster. The configuration lets it communicate with the worker installation so that jobs are not run on the master node. N.B. The master node is already under extreme load. Please keep all pre-run testing on your own local machine.

At the top of the MATLAB window, you should see a Parallel menu. Select Parallel > Manage Configurations. In the configuration window that opens, select File > Import and choose the file that you saved to your home directory, 2009aTorqueConfig.mat. Now right-click on this new configuration; one field needs to be edited. Set the Data Location field to your own data location, for example /home/jxm343/matlab/. Once you have modified it, click OK to confirm the edit to your own local configuration. To make sure the configuration works, click the Start Validation button in the lower right-hand corner of the Manage Configurations window. This runs a series of tests. As long as check marks appear, you are properly configured, which should be the case as long as you edited only the Data Location field. Click the circle next to this configuration to set it as the default (the one MATLAB will use), then close the window.

To run a distributed or parallel job, you create an M-file that runs your job. The M-file gathers all the information it needs from our configuration file, as long as the configuration is selected (if it is not currently selected, go to Parallel > Select Configuration > 2009aTorqueConfig). The M-file will look like the following:

  sched = findResource;
  j = createJob(sched);
  for i = 1:5
     % {i} is a placeholder: pass the input arguments of your function here
     j.createTask(@myTask, 1, {i});
  end
  submit(j);
  waitForState(j, 'finished');
  for i=1:length(j.Tasks)
     j.Tasks(i).OutputArguments{1};
  end
  destroy(j);

Only the createTask line needs to be edited: myTask is the function, defined in another M-file, that you are trying to run, and the cell array holds its input parameters. You can change the number of iterations of the first for loop depending on how many tasks you want to create, and you can save your output in whatever manner you like by editing the for loop after waitForState(j, 'finished'). Remember this is just an example.

N.B. Destroying your job is crucial. When a job is created, the cluster allocates resources to it, and those resources are held until the job is destroyed. This matters even more when your code hits an error, because the error might prevent execution from reaching the destroy(j) line; in that case, run destroy(j) yourself. Always make sure destroy(j) is run in order to free resources. To verify that nothing is running on your allocated resources, query the scheduler by typing sched at the MATLAB command line. The output should show that no tasks are finished, pending, queued, or otherwise. Also, with this method MATLAB does not report the job number after a job is submitted, but you can log on to the cluster and use the command qstat -u [username] to check your job.
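
The qstat check can be scripted. The following is a minimal sketch under stated assumptions: count_jobs is a hypothetical helper, and it assumes that job lines in the qstat -u listing start with a numeric job ID while header and separator lines do not. Outside the cluster, where qstat is not installed, the script only prints a notice.

```shell
#!/bin/bash
# Hypothetical helper (an assumption): count job lines in "qstat -u <user>"
# output read from standard input. Job lines are assumed to begin with a
# numeric job ID; header and separator lines are assumed not to.
count_jobs() {
    grep -c '^[0-9]'
}

if command -v qstat >/dev/null 2>&1; then
    remaining=$(qstat -u "$USER" | count_jobs)
    echo "Jobs still known to the scheduler: $remaining"
else
    echo "qstat not found; run this on the cluster"
fi
```

A count of zero after destroy(j) has run suggests that no jobs of yours are still holding resources.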

MATLAB Parallel Computing

Parallel computing with MATLAB on the cluster can be as simple as converting an ordinary for-loop so that it runs in parallel. As an example, consider computing and plotting a sine wave. An ordinary for-loop runs its iterations in sequence, like this:

clear A;
for i = 1:1024
   A(i) = sin (i*2*pi/1024);
end
plot(A);

To interactively run code that contains a parallel loop, you first open a MATLAB® pool. This reserves a collection of MATLAB worker sessions to run your loop iterations. The MATLAB pool can consist of MATLAB sessions running on your local machine or on a remote cluster:

matlabpool open;

With the MATLAB pool reserved, you can modify your code to run your loop in parallel by using a parfor statement:

clear A;
parfor i=1:1024
   A(i) = sin(i*2*pi/1024);
end
plot(A);


The only difference in this loop is the keyword parfor instead of for. The results are the same, but parfor distributes the loop iterations among the workers instead of running them one after another in the client.

Because the iterations run in parallel in other MATLAB sessions, each iteration must be completely independent of all other iterations. The worker calculating the value for A(100) might not be the same worker calculating A(500). There is no guarantee of sequence, so A(900) might be calculated before A(400). (The MATLAB Editor can help identify some problems with parfor code that might not contain independent iterations.) The only place where the values of all the elements of the array A are available is in the MATLAB client, after the data returns from the MATLAB workers and the loop completes.


Finally, you should release the workers by closing the pool:

matlabpool close;


Transparency

The body of a parfor-loop must be transparent, meaning that all references to variables must be "visible" in the text of the loop body. A variable that is referenced only indirectly, for example inside a string passed to eval, is not visible to the loop.

Example

 X = 5;
 parfor i = 1:4
   eval('X');
 end
  % This does not work: X appears only inside the string passed to eval, so it
  % is not visible to the parfor-loop and an error is given at run time.

Slicing Arrays

If a variable is initialized before a parfor-loop and then used inside the loop, it has to be passed to each worker that evaluates loop iterations. However, if all occurrences of the variable are indexed by the loop variable, the array is sliced: each worker receives only the part of the array it needs.

Monte Carlo Simulation Example

A slightly more complicated example is a Monte Carlo simulation of coin flips.

 function fixedmonteCarlo
 %  This example runs a Monte Carlo
 %  simulation in which each trial produces
 %  an integer from 0 to 20 (the number of
 %  heads in 20 coin flips). We want to
 %  display a histogram of the result.
    N = 500000;
    nCoins = 20;
    numHeads = zeros(1, N);
    tic
    parfor (simNum = 1:N)
       numHeads(simNum) = flipCoins(nCoins);
    end
    changeHeads = sign(diff(numHeads));
    toc
    x = (0:20);
    figure, hist(numHeads, x);
    figure, hist(changeHeads, [-1,0,1])
    print -deps matlab_test_plot.ps;
 
 function counter = flipCoins(nCoins)
 % Simulate the flipping of nCoins coins and count the number of heads
 counter = sum(rand(1, nCoins) > .5);

Enter this code into a file named fixedmonteCarlo.m and use a PBS script, montecarlotest.pbs, such as the following:

 #PBS -N myJob
 #PBS -l walltime=02:00:00
 #PBS -l nodes=4:ppn=2
 #PBS -o log.log
 #PBS -j oe
 
 # copy file to temporary directory
 cp $PBS_O_WORKDIR/* $TMPDIR

 # cd to the temporary directory
 cd $TMPDIR
 
 # Load the MATLAB module
 module load matlab_R2008a
 
 # Run MATLAB
 matlab -nodisplay -r fixedmonteCarlo
 #writes current log file information
 echo Master process running on `hostname`
 echo Directory is `pwd` 
 echo PBS has allocated the following nodes:
 echo `cat $PBS_NODEFILE` 
 NPROCS=`wc -l < $PBS_NODEFILE`
 echo This job has allocated $NPROCS processors
 echo `date`
 echo -------------------------------------------------------------------------------
 
# Copy everything back to the working directory
  cp -ru * $PBS_O_WORKDIR

Use the command qsub montecarlotest.pbs to submit the job on the cluster.

The plot is written to matlab_test_plot.ps and can be viewed using a Postscript viewer like kview on the HPC cluster.
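
The NPROCS value logged by the script counts the lines of $PBS_NODEFILE, which gives processor slots rather than distinct nodes. The sketch below shows both counts. It assumes the node file lists one hostname per allocated processor slot; outside a PBS job, a sample file with hypothetical hostnames (node01 to node04, mirroring nodes=4:ppn=2) stands in for the real one.

```shell
#!/bin/bash
# Sketch: distinguish processor slots from distinct nodes in a PBS node file.
# Assumption: the node file has one hostname per allocated processor slot,
# so each host appears once for every slot it contributes.
if [ -z "$PBS_NODEFILE" ]; then
    # Not inside a PBS job: build a sample file with hypothetical hostnames
    PBS_NODEFILE=$(mktemp)
    printf '%s\n' node01 node01 node02 node02 node03 node03 node04 node04 > "$PBS_NODEFILE"
fi

NPROCS=$(wc -l < "$PBS_NODEFILE")            # total processor slots
NNODES=$(sort -u "$PBS_NODEFILE" | wc -l)    # distinct hosts

echo "This job has $NPROCS processor slots on $NNODES nodes"
```

For the sample file the counts are 8 slots on 4 nodes, which matches a request of nodes=4:ppn=2.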

Important Information

For more information about running parallel jobs in MATLAB, see the users guide listed under "For More On Parallel Computing" at the end of this page.

Running MATLAB jobs on the cluster from a remote machine

Our current MATLAB configuration allows distributed jobs to be submitted from a remote machine that has MATLAB installed. By setting up an ssh client to connect without a password, and importing our configuration for the cluster, distributed jobs should find their way to the cluster and send the results back to the remote machine.

The following steps assume that you are running a Windows machine and that you have a client license for MATLAB with the Parallel Computing Toolbox (available at softwarecenter.case.edu).

Case's licenses include the Parallel Computing Toolbox for MATLAB, so it or a separate license should be installed on your Windows machine.

Please see the step-by-step how-to on running MATLAB jobs on the cluster from a remote machine.

Running Distributed Jobs

The Worker property of a task identifies the worker that ran it. Its definition is:

Worker

Worker session that performed task

Description
   The Worker property value is an object representing the worker session that evaluated the task.

Characteristics
   Usage       Task object
   Read-only   Always
   Data type   Worker object

Values
   Before a task is evaluated, its Worker property value is an empty vector.

Examples
   Find out which worker evaluated a particular task.

   submit(job1)
   waitForState(job1,'finished')
   t1 = findTask(job1,'ID',1)
   t1.Worker.Name
   ans =
   node55_worker1

Using Other Versions of MATLAB

In the description above, the module names "matlab" and "matlab_worker" are actually synonyms for "matlab_R2009a" and "matlab_R2009a_worker". There are also module files for older versions of MATLAB. The presence of a module file for an older version suggests, but does not guarantee, that the corresponding version is still installed and usable. These older versions are unsupported but may be useful for running old scripts that would require updating to run under the current version of MATLAB.

Generally, two versions of MATLAB are released each year, an "a" version in the spring and a "b" version in the fall. We try to install new versions shortly after they are released.

For More On Parallel Computing

Users Guide Site

Users Guide (PDF)

--Mahmoud.Audu 09:55, June 30, 2008 (EDT) --James.Munch 9:30, June 29, 2009 (EDT)


This page was last modified 15:08, October 6, 2009 by Hadrian Djohari.